A Parallel K-Medoids Algorithm for Clustering based on MapReduce

Cited by: 0
Authors
Shafiq, M. Omair [1]
Torunski, Eric [1]
Affiliations
[1] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada
Source
2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016) | 2016
Keywords
Clustering; K-Medoids; Big Data; MapReduce;
DOI
10.1109/ICMLA.2016.196
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering of data into different clusters or categories is one of the most important machine learning techniques. Several well-established algorithms and techniques exist for clustering small to medium scale data. In the era of Big Data, with applications being large-scale and data-intensive in nature, there is a significant increase in the volume, variety and velocity of data in the form of log events produced by such applications. This makes clustering huge amounts of data more challenging, and existing techniques more limited. In this paper, we present a parallel K-Medoids clustering algorithm based on the MapReduce paradigm that can perform clustering on large-scale data. We have kept our solution simple and feasible so that it can handle a huge volume, variety and velocity of data. Another distinguishing feature of our proposed algorithm is that, unlike other related approaches, it can achieve parallelism independent of the number k of clusters to be formed. We have tested our algorithm on large amounts of data and on a real-life case study.
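To make the general idea concrete, the following is a minimal single-process sketch of how a K-Medoids iteration can be phrased as map and reduce steps. It is not the authors' algorithm, whose details the abstract does not give. All names here (`map_assign`, `shuffle`, `reduce_medoid`, `k_medoids`), the Euclidean distance, and the toy data are assumptions made for illustration. In this simple variant the map step parallelizes over data chunks, but the reduce step runs once per cluster, so it does not reproduce the paper's stated property of parallelism independent of k.

```python
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two points given as tuples (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def map_assign(chunk, medoids):
    """Map step: for each point in one data chunk, emit (nearest medoid index, point)."""
    return [
        (min(range(len(medoids)), key=lambda i: dist(p, medoids[i])), p)
        for p in chunk
    ]


def shuffle(pairs):
    """Group emitted points by cluster index, as a MapReduce shuffle would."""
    groups = defaultdict(list)
    for key, point in pairs:
        groups[key].append(point)
    return groups


def reduce_medoid(points):
    """Reduce step: pick the member with the smallest total distance to the others."""
    return min(points, key=lambda c: sum(dist(c, p) for p in points))


def k_medoids(chunks, medoids, max_iter=20):
    """Repeat map/shuffle/reduce until the medoids stop changing (hypothetical driver)."""
    for _ in range(max_iter):
        pairs = [kv for chunk in chunks for kv in map_assign(chunk, medoids)]
        groups = shuffle(pairs)
        # Keep the old medoid if a cluster received no points.
        new = [
            reduce_medoid(groups[i]) if i in groups else medoids[i]
            for i in range(len(medoids))
        ]
        if new == medoids:
            break
        medoids = new
    return medoids


# Toy data split into two chunks, standing in for partitions of a large data set.
chunks = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
result = k_medoids(chunks, medoids=[(0, 1), (11, 10)])
print(result)
```

In a real MapReduce deployment each chunk would be handled by a separate mapper and the current medoids would be distributed to every mapper before each iteration; here the loop over `chunks` only simulates that.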
Pages: 502 - 507
Number of pages: 6
Related papers
50 in total
  • [31] Evaluation of bearing performance degradation based on MMFE and extensible k-medoids clustering algorithm
    Zhao C.
    Liu Y.
    Zhao Y.
    Bai Y.
    Shi J.
    Zhendong yu Chongji/Journal of Vibration and Shock, 2022, 41 (17) : 123 - 130+159
  • [32] Fault detection of continuous glucose measurements based on modified k-medoids clustering algorithm
    Yu, Xia
    Sun, Xiaoyu
    Zhao, Yuhang
    Liu, Jianchang
    Li, Hongru
    NEURAL COMPUTING & APPLICATIONS, 2020,
  • [33] Active Distance-Based Clustering Using K-Medoids
    Aghaee, Amin
    Ghadiri, Mehrdad
    Baghshah, Mahdieh Soleymani
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2016, PT I, 2016, 9651 : 253 - 264
  • [34] Advancing the PAM Algorithm to Semi-supervised k-Medoids Clustering
    Janosova, Miriama
    Lang, Andreas
    Budikova, Petra
    Schubert, Erich
    Dohnal, Vlastislav
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2024, 2025, 15268 : 223 - 237
  • [35] Application of the k-medoids Partitioning Algorithm for Clustering of Time Series Data
    Radovanovic, Ana
    Ye, Xinlin
    Milanovic, Jovica, V
    Milosavljevic, Nina
    Storchi, Riccardo
    2020 IEEE PES INNOVATIVE SMART GRID TECHNOLOGIES EUROPE (ISGT-EUROPE 2020): SMART GRIDS: KEY ENABLERS OF A GREEN POWER SYSTEM, 2020, : 645 - 649
  • [36] Study of Optimizing Combined-blowing in EAF based on K-medoids Clustering Algorithm
    Yang, Lingzhi
    Zhu, Rong
    Dong, Kai
    Liu, Wenjuan
    Ma, Guohong
    CHEMICAL, MATERIAL AND METALLURGICAL ENGINEERING III, PTS 1-3, 2014, 881-883 : 1540 - +
  • [37] A Cooperative Spectrum Sensing Algorithm Based on Principal Component Analysis and K-medoids Clustering
    Sun, Chenhao
    Wang, Yonghua
    Wan, Pin
    Du, Yiqi
    PROCEEDINGS 2018 33RD YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2018, : 835 - 839
  • [38] A K-medoids Clustering Algorithm with Initial Centers Optimized by a P System
    Li, Qian
    Liu, Xiyu
    HUMAN CENTERED COMPUTING, HCC 2014, 2015, 8944 : 488 - 500
  • [39] Fuzzy kernel K-medoids clustering algorithm for uncertain data objects
    Behnam Tavakkol
    Youngdoo Son
    Pattern Analysis and Applications, 2021, 24 : 1287 - 1302
  • [40] A Novel K-medoids clustering recommendation algorithm based on probability distribution for collaborative filtering
    Deng, Jiangzhou
    Guo, Junpeng
    Wang, Yong
    KNOWLEDGE-BASED SYSTEMS, 2019, 175 : 96 - 106