Parallel WaveCluster: A linear scaling parallel clustering algorithm implementation with application to very large datasets

Cited by: 5
Authors
Yildirim, Ahmet Artu [1]
Ozdogan, Cem [1]
Affiliations
[1] Cankaya Univ, Dept Comp Engn, TR-06530 Ankara, Turkey
Keywords
Cluster analysis; WaveCluster algorithm; Parallel WaveCluster; Spatial data
DOI
10.1016/j.jpdc.2011.03.007
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
A linear-scaling parallel implementation of a clustering algorithm and its application to cluster analysis of very large datasets are reported. WaveCluster is a novel clustering approach based on wavelet transforms. Although this approach can detect clusters of arbitrary shape efficiently, it requires a considerable amount of time to collect results for large multi-dimensional datasets. We propose a parallel implementation of the WaveCluster algorithm based on the message-passing model for a distributed-memory multiprocessor system. In the proposed method, communication among processors and memory requirements are kept to a minimum to achieve high efficiency. We conducted experiments on a dense dataset and a sparse dataset to measure the algorithm's behavior appropriately. The experimental results demonstrate that the developed parallel WaveCluster algorithm achieves high speedup and scales linearly with the number of processors. (C) 2011 Elsevier Inc. All rights reserved.
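The abstract names the technique but gives no algorithmic detail, so the following is only a minimal serial sketch of the general WaveCluster idea as it is commonly described: quantize the points onto a grid, apply a wavelet low-pass step, keep the dense cells, and label connected components. It is not the authors' parallel implementation. All function names, the 2D restriction, the single Haar level, the 4-connectivity rule, and the `bins` and `threshold` parameters are assumptions made for illustration.

```python
from collections import deque


def cell_index(value, lo, hi, bins):
    """Map a coordinate in [lo, hi] to a grid cell index in [0, bins - 1]."""
    return min(int((value - lo) * bins / (hi - lo)), bins - 1)


def quantize(points, bins, lo, hi):
    """Count the points falling into each cell of a bins x bins grid."""
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        grid[cell_index(x, lo, hi, bins)][cell_index(y, lo, hi, bins)] += 1
    return grid


def haar_lowpass(grid):
    """One 2D Haar level: each 2x2 block average is proportional to the
    approximation (low-frequency) coefficient of that block."""
    n = len(grid) // 2
    return [[(grid[2 * i][2 * j] + grid[2 * i + 1][2 * j]
              + grid[2 * i][2 * j + 1] + grid[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(n)] for i in range(n)]


def label_components(grid, threshold):
    """Label 4-connected groups of cells whose value is >= threshold.
    Label 0 marks cells treated as noise (assumed convention)."""
    n = len(grid)
    labels = [[0] * n for _ in range(n)]
    count = 0
    for i in range(n):
        for j in range(n):
            if grid[i][j] < threshold or labels[i][j]:
                continue
            count += 1
            labels[i][j] = count
            queue = deque([(i, j)])
            while queue:  # breadth-first flood fill of one component
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    c, d = a + da, b + db
                    if (0 <= c < n and 0 <= d < n and not labels[c][d]
                            and grid[c][d] >= threshold):
                        labels[c][d] = count
                        queue.append((c, d))
    return labels, count


def wavecluster_sketch(points, bins=8, lo=0.0, hi=1.0, threshold=0.5):
    """Return one cluster label per point (0 = noise) and the cluster count."""
    coarse = haar_lowpass(quantize(points, bins, lo, hi))
    labels, count = label_components(coarse, threshold)
    assigned = [labels[cell_index(x, lo, hi, bins) // 2]
                      [cell_index(y, lo, hi, bins) // 2] for x, y in points]
    return assigned, count


points = [(0.05, 0.05), (0.06, 0.07), (0.10, 0.10), (0.11, 0.05),
          (0.90, 0.90), (0.95, 0.92), (0.91, 0.97), (0.50, 0.50)]
assigned, count = wavecluster_sketch(points)
print(assigned, count)
```

Each step touches every point or grid cell once, which is what makes the grid-based approach attractive for large datasets. A message-passing version would additionally need to partition the grid among processes and reconcile labels across partition borders, a step this sketch does not attempt.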
Pages: 955-962
Page count: 8