MapReduce-based fuzzy c-means clustering algorithm: implementation and scalability

Cited by: 54
Author
Ludwig, Simone A. [1]
Affiliation
[1] N Dakota State Univ, Dept Comp Sci, Fargo, ND 58105 USA
Keywords
MapReduce; Hadoop; Scalability
DOI
10.1007/s13042-015-0367-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The management and analysis of big data has been identified as one of the most important emerging needs in recent years, owing to the sheer volume and increasing complexity of the data being created or collected. Current clustering algorithms cannot handle big data, and therefore scalable solutions are necessary. Since fuzzy clustering algorithms have been shown to outperform hard clustering approaches in terms of accuracy, this paper investigates the parallelization and scalability of a common and effective fuzzy clustering algorithm, the fuzzy c-means (FCM) algorithm. The algorithm is parallelized using the MapReduce paradigm, and the paper outlines how the Map and Reduce primitives are implemented. A validity analysis is conducted to show that the implementation works correctly, achieving purity results competitive with state-of-the-art clustering algorithms. Furthermore, a scalability analysis is conducted to demonstrate the performance of the parallel FCM implementation with an increasing number of computing nodes.
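As an illustration of the idea in the abstract, the following is a minimal sketch of how one FCM iteration could be split into Map and Reduce steps. It assumes the standard FCM update rules (membership u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)), center c_j = sum_i u_ij^m x_i / sum_i u_ij^m) and is not taken from the paper. The function names fcm_map, fcm_reduce and fcm_iteration are hypothetical, and a plain in-memory dictionary stands in for Hadoop's shuffle phase.

```python
from collections import defaultdict
from math import dist


def fcm_map(point, centers, m=2.0):
    """Map step (hypothetical): for one point, emit per-cluster partial sums.

    Yields (cluster_index, (u^m * point, u^m)) for every cluster.
    """
    d = [dist(point, c) for c in centers]
    if any(di == 0.0 for di in d):
        # The point coincides with a center: give it full membership there
        # (shared equally if several centers coincide with the point).
        u = [1.0 if di == 0.0 else 0.0 for di in d]
        total = sum(u)
        u = [ui / total for ui in u]
    else:
        exp = 2.0 / (m - 1.0)
        u = [1.0 / sum((di / dk) ** exp for dk in d) for di in d]
    for j, uj in enumerate(u):
        w = uj ** m
        yield j, ([w * x for x in point], w)


def fcm_reduce(values):
    """Reduce step (hypothetical): combine partial sums into a new center.

    Assumes the cluster received a nonzero total weight.
    """
    num, den = None, 0.0
    for vec, w in values:
        num = vec if num is None else [a + b for a, b in zip(num, vec)]
        den += w
    return [a / den for a in num]


def fcm_iteration(points, centers, m=2.0):
    """Run one map -> shuffle -> reduce pass and return the updated centers."""
    grouped = defaultdict(list)  # in-memory stand-in for the shuffle phase
    for p in points:
        for j, partial in fcm_map(p, centers, m):
            grouped[j].append(partial)
    return [fcm_reduce(grouped[j]) for j in range(len(centers))]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(fcm_iteration(pts, [(1.0, 1.0), (9.0, 9.0)]))
```

In a real Hadoop job the map calls would run on separate data splits and a combiner could pre-sum the partials per node. Repeating fcm_iteration until the centers stop moving gives the usual FCM loop.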
Pages: 923-934
Number of pages: 12