Performance of Parallel K-Means Algorithms in Java

Cited by: 6
Authors
Nigro, Libero [1]
Affiliation
[1] Univ Calabria, Engn Dept Informat Modelling Elect & Syst Sci DIM, I-87036 Arcavacata Di Rende, Italy
Keywords
parallel algorithms; multi-core machines; K-means clustering; Java; functional parallel streams; actors; message-passing; lightweight parallel programming
DOI
10.3390/a15040117
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
K-means is a well-known clustering algorithm often used for its simplicity and potential efficiency. Its properties and limitations have been investigated in many works reported in the literature. K-means, though, suffers from computational problems when dealing with large datasets with many dimensions and a large number of clusters. Therefore, many authors have proposed and experimented with different techniques for the parallel execution of K-means. This paper describes a novel approach to parallel K-means that targets today's commodity multicore machines with shared memory. Two reference implementations in Java are developed and their performance is compared. The first one is structured according to a map/reduce schema that leverages the built-in multi-threaded concurrency that Java automatically provides to parallel streams. The second one, allocated on the available cores, exploits the parallel programming model of the Theatre actor system, which is control-based, totally lock-free, and purposely relies on threads as coarse-grain "programming-in-the-large" units. The experimental results confirm that good execution performance can be achieved through the implicit and intuitive use of Java concurrency in parallel streams. However, better execution performance can be guaranteed by the modular Theatre implementation, which proves better suited to exploiting the available computational resources.
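The map/reduce structure mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the class name `KMeansStreamsSketch`, its method names, the empty-cluster rule, and the stopping test are all assumptions made for illustration. It only shows how one K-means (Lloyd) iteration might be split into a parallel assignment phase and a parallel centroid-update phase using standard Java parallel streams.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical illustration only; not the implementation evaluated in the paper.
public class KMeansStreamsSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the centroid closest to the given point. */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int k = 0; k < centroids.length; k++) {
            double dist = squaredDistance(point, centroids[k]);
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    /** One iteration: assign points in parallel, then recompute centroids in parallel. */
    static double[][] step(double[][] points, double[][] centroids) {
        int dim = points[0].length;
        // Map phase: each point is assigned independently, so no locking is needed.
        int[] assignment = Arrays.stream(points).parallel()
                .mapToInt(p -> nearest(p, centroids))
                .toArray();
        // Reduce phase: each cluster's mean is computed by an independent task.
        return IntStream.range(0, centroids.length).parallel().mapToObj(k -> {
            double[] sum = new double[dim];
            int count = 0;
            for (int i = 0; i < points.length; i++) {
                if (assignment[i] == k) {
                    for (int d = 0; d < dim; d++) {
                        sum[d] += points[i][d];
                    }
                    count++;
                }
            }
            if (count == 0) {
                return centroids[k].clone(); // assumption: an empty cluster keeps its old centroid
            }
            for (int d = 0; d < dim; d++) {
                sum[d] /= count;
            }
            return sum;
        }).toArray(double[][]::new);
    }

    /** Repeats step() until no centroid moves more than `tolerance` (squared distance). */
    static double[][] cluster(double[][] points, double[][] initial,
                              int maxIterations, double tolerance) {
        double[][] centroids = initial;
        for (int it = 0; it < maxIterations; it++) {
            double[][] next = step(points, centroids);
            double shift = 0.0;
            for (int k = 0; k < next.length; k++) {
                shift = Math.max(shift, squaredDistance(centroids[k], next[k]));
            }
            centroids = next;
            if (shift <= tolerance) {
                break;
            }
        }
        return centroids;
    }
}
```

In this sketch the update phase parallelizes over clusters rather than over points, which keeps each task free of shared mutable state at the cost of rescanning the assignment array once per cluster; a production version would more likely accumulate per-thread partial sums and merge them.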
Pages: 15
Related papers
50 in total
  • [21] A Comparative Performance Analysis of Fast K-Means Clustering Algorithms
    Beecks, Christian
    Berns, Fabian
    Huewel, Jan David
    Linxen, Andrea
    Schlake, Georg Stefan
    Duesterhus, Tim
    [J]. INFORMATION INTEGRATION AND WEB INTELLIGENCE, IIWAS 2022, 2022, 13635 : 119 - 125
  • [22] Communication performance of Java-based parallel virtual machines
    Yalamanchilli, N
    Cohen, W
    [J]. CONCURRENCY-PRACTICE AND EXPERIENCE, 1998, 10 (11-13): : 1189 - 1196
  • [23] Parallel Theatre: An actor framework in Java for high performance computing
    Nigro, Libero
    [J]. SIMULATION MODELLING PRACTICE AND THEORY, 2021, 106
  • [24] A Modified K-means Algorithms - Bi-Level K-Means Algorithm
    Yu, Shyr-Shen
    Chu, Shao-Wei
    Wang, Ching-Lin
    Chan, Yung-Kuan
    Chuang, Chia-Yi
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON SOFT COMPUTING IN INFORMATION COMMUNICATION TECHNOLOGY, 2014, : 10 - 13
  • [25] Empirical Evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means Clustering Algorithms
    Banerjee, Shreya
    Choudhary, Ankit
    Pal, Somnath
    [J]. 2015 IEEE INTERNATIONAL WIE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (WIECON-ECE), 2015, : 172 - 176
  • [26] Performance evaluation of Java/PCJ implementation of parallel algorithms on the cloud (extended version)
    Nowicki, Marek
    Górski, Lukasz
    Bala, Piotr
    [J]. Concurrency and Computation: Practice and Experience, 2023, 35 (15):
  • [27] HPJava: data parallel extensions to Java
    Carpenter, B
    Zhang, GS
    Fox, G
    Li, XY
    Wen, YH
    [J]. CONCURRENCY-PRACTICE AND EXPERIENCE, 1998, 10 (11-13): : 873 - 877
  • [28] K-Java: A Complete Semantics of Java
    Bogdanas, Denis
    Rosu, Grigore
    [J]. ACM SIGPLAN NOTICES, 2015, 50 (01) : 445 - 456
  • [29] K-means algorithms for functional data
    Lopez Garcia, Maria Luz
    Garcia-Rodenas, Ricardo
    Gonzalez Gomez, Antonia
    [J]. NEUROCOMPUTING, 2015, 151 : 231 - 245
  • [30] A note on constrained k-means algorithms
    Ng, MK
    [J]. PATTERN RECOGNITION, 2000, 33 (03) : 515 - 519