Efficient and Reliable Clustering by Parallel Random Swap Algorithm

Cited by: 2
Authors
Nigro, Libero [1]
Cicirelli, Franco [2]
Franti, Pasi [3]
Affiliations
[1] Univ Calabria, DIMES Dept Informat Modelling Elect & Syst Sci, I-87036 Arcavacata Di Rende, Italy
[2] Natl Res Council Italy, CNR, Inst High Performance Comp & Networking ICAR, I-87036 Arcavacata Di Rende, Italy
[3] Univ Eastern Finland, Sch Comp, Machine Learning Grp, POB 111, Joensuu 80101, Finland
Keywords
Clustering problem; K-Means; Random swap; Parallelism; Streams; Lambda Expressions; Java
DOI
10.1109/DS-RT55542.2022.9932090
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Solving large-scale clustering problems requires an efficient algorithm that can also be implemented in parallel. K-means would be suitable, but it can lead to an inaccurate clustering result. To overcome this problem, we present a parallel version of the random swap clustering algorithm. It combines the scalability of k-means with high clustering accuracy. The new clustering method is implemented on top of Java parallel streams and lambda expressions, which offer execution time benefits. The method is applied to standard benchmark datasets that vary in population size, distribution of the managed records, dimensionality of the data points, and number of clusters. The experimental results confirm that parallel random swap achieves high-quality clustering with high time efficiency.
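The idea in the abstract can be illustrated with a minimal Java sketch. This is not the authors' implementation: the class name `RandomSwapSketch`, the method names, the choice of two k-means iterations per trial swap, and the handling of empty clusters are all assumptions made for illustration. It shows one common way to structure random swap (replace a random centroid with a random data point, fine-tune with a few k-means iterations, keep the result only if the sum of squared errors decreases), with the point-to-centroid assignment and the error sum running on parallel streams.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

// Illustrative sketch only; names and parameters are hypothetical.
class RandomSwapSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0.0;
        for (int x = 0; x < a.length; x++) {
            double diff = a[x] - b[x];
            s += diff * diff;
        }
        return s;
    }

    // Index of the centroid closest to p (ties go to the lower index).
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        for (int j = 1; j < centroids.length; j++) {
            if (dist2(p, centroids[j]) < dist2(p, centroids[best])) best = j;
        }
        return best;
    }

    // Sum of squared errors, computed with a parallel stream over the points.
    static double sse(double[][] data, double[][] centroids) {
        return Arrays.stream(data).parallel()
                .mapToDouble(p -> dist2(p, centroids[nearest(p, centroids)]))
                .sum();
    }

    // A fixed number of k-means iterations; the assignment step is parallel.
    static double[][] kMeans(double[][] data, double[][] start, int iterations) {
        int k = start.length, d = data[0].length;
        double[][] c = start;
        for (int it = 0; it < iterations; it++) {
            final double[][] current = c;
            int[] label = IntStream.range(0, data.length).parallel()
                    .map(i -> nearest(data[i], current))
                    .toArray();
            double[][] sum = new double[k][d];
            int[] count = new int[k];
            for (int i = 0; i < data.length; i++) {
                count[label[i]]++;
                for (int x = 0; x < d; x++) sum[label[i]][x] += data[i][x];
            }
            double[][] next = new double[k][];
            for (int j = 0; j < k; j++) {
                if (count[j] == 0) {          // assumption: an empty cluster keeps its centroid
                    next[j] = current[j];
                    continue;
                }
                next[j] = new double[d];
                for (int x = 0; x < d; x++) next[j][x] = sum[j][x] / count[j];
            }
            c = next;
        }
        return c;
    }

    // Random swap: trial-and-error centroid replacement, accepted only on improvement.
    static double[][] randomSwap(double[][] data, int k, int swaps, long seed) {
        Random rng = new Random(seed);
        double[][] init = new double[k][];
        for (int j = 0; j < k; j++) init[j] = data[rng.nextInt(data.length)];
        double[][] best = kMeans(data, init, 2);
        double bestSse = sse(data, best);
        for (int t = 0; t < swaps; t++) {
            double[][] cand = best.clone();   // shallow copy; rows are never mutated in place
            cand[rng.nextInt(k)] = data[rng.nextInt(data.length)];
            cand = kMeans(data, cand, 2);
            double s = sse(data, cand);
            if (s < bestSse) {                // keep the swap only if the error decreases
                best = cand;
                bestSse = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}};
        double[][] centroids = randomSwap(data, 2, 50, 42L);
        System.out.println("SSE = " + sse(data, centroids));
    }
}
```

Because a swap is accepted only when the error decreases, the accepted solution never gets worse from one trial to the next. The parallel sum may add terms in a different order between runs, so the reported error can differ in the last floating-point digits.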
Pages: 4
Related papers
50 in total
  • [1] Parallel random swap: An efficient and reliable clustering algorithm in Java
    Nigro, Libero
    Cicirelli, Franco
    Franti, Pasi
    SIMULATION MODELLING PRACTICE AND THEORY, 2023, 124
  • [2] Probabilistic Clustering by Random Swap Algorithm
    Franti, Pasi
    Virmajoki, Olli
    Hautamaki, Ville
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 2400 - 2403
  • [3] Distributed random swap: An efficient algorithm for minimum sum-of-squares clustering
    Kozbagarov, Olzhas
    Mussabayev, Rustam
    INFORMATION SCIENCES, 2024, 681
  • [4] Centroid Ratio for a Pairwise Random Swap Clustering Algorithm
    Zhao, Qinpei
    Franti, Pasi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (05) : 1090 - 1101
  • [5] Performance Improvement of Clustering Method Based on Random Swap Algorithm
    Yu, Sunjin
    Yoon, Changyong
    INTERNATIONAL JOURNAL OF FUZZY LOGIC AND INTELLIGENT SYSTEMS, 2019, 19 (02) : 97 - 102
  • [6] Efficiency of random swap clustering
    Fränti P.
    Journal of Big Data, 5 (1)
  • [7] AN EFFICIENT PARALLEL ALGORITHM FOR RANDOM SAMPLING
    RAJAN, VY
    GHOSH, RK
    GUPTA, P
    INFORMATION PROCESSING LETTERS, 1989, 30 (05) : 265 - 268
  • [8] An efficient clustering algorithm for partitioning parallel programs
    Maheshwari, P
    Shen, H
    PARALLEL COMPUTING, 1998, 24 (5-6) : 893 - 909
  • [9] Parallel unsupervised k-windows: An efficient parallel clustering algorithm
    Tasoulis, DK
    Alevizos, P
    Boutsinas, B
    Vrahatis, MN
    PARALLEL COMPUTING TECHNOLOGIES, PROCEEDINGS, 2003, 2763 : 336 - 344
  • [10] An efficient parallel direction-based clustering algorithm
    Zhong, Kai
    Zhou, Xu
    Zhou, Liqian
    Yang, Zhibang
    Liu, Chubo
    Xiao, Na
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2020, 145 : 24 - 33