Parallel random swap: An efficient and reliable clustering algorithm in Java

Cited by: 6
Authors
Nigro, Libero [1]
Cicirelli, Franco [2]
Fränti, Pasi [3]
Affiliations
[1] Univ Calabria, DIMES Dept Informat Modelling Elect & Syst Sci, I-87036 Arcavacata Di Rende, Italy
[2] CNR Natl Res Council Italy, Inst High Performance Comp & Networking ICAR Rende, I-87036 Arcavacata Di Rende, Italy
[3] Univ Eastern Finland, Sch Comp, Machine Learning Grp, POB 111, Joensuu 80101, Finland
Keywords
Clustering problem; K-means; Random swap; Parallelism; Java; Streams; Lambda expressions; Actors; Multi-core machines; K-MEANS ALGORITHM; OPTIMIZATION
DOI
10.1016/j.simpat.2022.102712
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Solving large-scale clustering problems requires an efficient algorithm that can also be implemented in parallel. K-means would be suitable, but it can lead to an inaccurate clustering result. To overcome this problem, we present a parallel version of the random swap clustering algorithm. It combines the scalability of k-means with the high clustering accuracy of random swap. The algorithm is implemented in Java in two ways. The first implementation uses Java parallel streams and lambda expressions. The solution exploits a built-in multi-threaded organization capable of offering competitive speedup. The second implementation is achieved on top of the Theatre actor system which ensures better scalability and high-performance computing through fine-grain resource control. The two implementations are then applied to standard benchmark datasets, with a varying population size and distribution of managed records, dimensionality of data points and the number of clusters. The experimental results confirm that high-quality clustering can be obtained together with a very good execution efficiency. Our Java code is publicly available at: https://github.com/uef-machine-learning.
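The abstract's first implementation route (random swap on top of Java parallel streams and lambda expressions) can be illustrated with a minimal sketch. This is not the authors' code, which is at the GitHub address above. The class name RandomSwapSketch, all method names, the fixed two k-means iterations per swap, and the toy data are assumptions made for illustration. Only the partition and error computations are parallelized here.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical sketch of random swap clustering; not the published implementation.
public class RandomSwapSketch {

    // Squared Euclidean distance between two points.
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }

    // Index of the centroid nearest to point p (lowest index wins ties).
    static int nearest(double[] p, double[][] c) {
        int best = 0;
        double bestD = dist2(p, c[0]);
        for (int j = 1; j < c.length; j++) {
            double d = dist2(p, c[j]);
            if (d < bestD) { bestD = d; best = j; }
        }
        return best;
    }

    // Partition step: each point is labelled independently, so a parallel stream applies.
    static int[] partition(double[][] x, double[][] c) {
        return IntStream.range(0, x.length).parallel().map(i -> nearest(x[i], c)).toArray();
    }

    // Sum of squared errors of a solution, computed as a parallel reduction.
    static double sse(double[][] x, double[][] c, int[] label) {
        return IntStream.range(0, x.length).parallel()
                .mapToDouble(i -> dist2(x[i], c[label[i]])).sum();
    }

    // Update step: each centroid becomes the mean of its cluster;
    // an empty cluster keeps its previous centroid.
    static double[][] update(double[][] x, int[] label, double[][] old) {
        int k = old.length, d = x[0].length;
        double[][] sum = new double[k][d];
        int[] count = new int[k];
        for (int i = 0; i < x.length; i++) {
            count[label[i]]++;
            for (int j = 0; j < d; j++) sum[label[i]][j] += x[i][j];
        }
        for (int c = 0; c < k; c++) {
            if (count[c] == 0) { sum[c] = old[c].clone(); continue; }
            for (int j = 0; j < d; j++) sum[c][j] /= count[c];
        }
        return sum;
    }

    // A fixed number of k-means iterations used as local fine-tuning.
    static double[][] kMeans(double[][] x, double[][] c, int iters) {
        for (int t = 0; t < iters; t++) c = update(x, partition(x, c), c);
        return c;
    }

    // Random swap: replace a random centroid by a random data point, fine-tune
    // with k-means, and keep the candidate only if it lowers the SSE.
    static double[][] randomSwap(double[][] x, int k, int swaps, long seed) {
        Random rnd = new Random(seed);
        double[][] best = new double[k][];
        for (int j = 0; j < k; j++) best[j] = x[rnd.nextInt(x.length)].clone();
        best = kMeans(x, best, 2);
        double bestSse = sse(x, best, partition(x, best));
        for (int s = 0; s < swaps; s++) {
            double[][] cand = new double[k][];
            for (int j = 0; j < k; j++) cand[j] = best[j].clone();
            cand[rnd.nextInt(k)] = x[rnd.nextInt(x.length)].clone();
            cand = kMeans(x, cand, 2);
            double candSse = sse(x, cand, partition(x, cand));
            if (candSse < bestSse) { best = cand; bestSse = candSse; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy data (assumed): three well-separated groups of four points each.
        double[][] x = { {0,0},{0,1},{1,0},{1,1}, {10,10},{10,11},{11,10},{11,11},
                         {20,0},{20,1},{21,0},{21,1} };
        double[][] c = randomSwap(x, 3, 200, 42L);
        System.out.println("SSE = " + sse(x, c, partition(x, c)));
        System.out.println(Arrays.deepToString(c));
    }
}
```

In this sketch a rejected swap costs nothing beyond the trial itself, because the current best solution is never modified in place. The number of swaps and the seed are free parameters of the sketch, not values taken from the paper.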
Pages: 17