A Parallel K-Medoids Algorithm for Clustering based on MapReduce

Cited by: 0
Authors
Shafiq, M. Omair [1]
Torunski, Eric [1]
Affiliations
[1] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada
Keywords
Clustering; K-Medoids; Big Data; MapReduce;
DOI
10.1109/ICMLA.2016.196
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One of the most important machine learning techniques is clustering, the grouping of data into different clusters or categories. Several well-established algorithms and techniques exist for clustering small- to medium-scale data. In the era of Big Data, with applications that are large-scale and data-intensive in nature, there is a significant increase in the volume, variety and velocity of data produced by such applications in the form of log events. This makes the task of clustering huge amounts of data more challenging and constrained. In this paper, we present a parallel K-Medoids clustering algorithm based on the MapReduce paradigm that can perform clustering on large-scale data. We have kept our solution simple and feasible so that it can handle a huge volume, variety and velocity of data. Another distinguishing feature of our proposed algorithm is that it can achieve parallelism independent of the number of clusters k to be formed, unlike other related approaches. We have tested our algorithm on large amounts of data and on a real-life case study.
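For orientation, the general idea of expressing K-Medoids as map and reduce steps can be sketched as follows. This is a minimal, single-process illustration of the textbook formulation, not the algorithm proposed in the paper: the function names (`map_assign`, `reduce_update`, `k_medoids`), the Manhattan distance, and the chunked input are all assumptions made for the example. In this baseline the reduce step has one group per medoid, which is the dependence on k that the abstract says the proposed approach avoids.

```python
from collections import defaultdict

def dist(a, b):
    # Manhattan distance between two equal-length tuples (an assumed metric).
    return sum(abs(x - y) for x, y in zip(a, b))

def map_assign(chunk, medoids):
    # Map step: emit (index of nearest medoid, point) for each point in a chunk.
    for p in chunk:
        nearest = min(range(len(medoids)), key=lambda i: dist(p, medoids[i]))
        yield nearest, p

def reduce_update(points):
    # Reduce step: pick the member with the smallest total distance to the others.
    return min(points, key=lambda c: sum(dist(c, p) for p in points))

def k_medoids(chunks, medoids, max_iter=20):
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)
        for chunk in chunks:  # each chunk stands in for one map task
            for key, p in map_assign(chunk, medoids):
                groups[key].append(p)  # stands in for the shuffle phase
        # Keep the old medoid if a cluster received no points.
        new = [reduce_update(groups[i]) if groups[i] else medoids[i]
               for i in range(len(medoids))]
        if new == medoids:  # converged: medoids did not change
            break
        medoids = new
    return medoids

# Hypothetical toy data: two chunks, two initial medoids.
chunks = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
result = k_medoids(chunks, [(0, 1), (11, 10)])
```

On a real cluster the loop over `chunks` would run as distributed map tasks and the per-cluster update as reduce tasks; here both run sequentially only to keep the sketch self-contained.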
Pages: 502 - 507
Page count: 6
Related papers
50 in total
  • [1] Parallel K-Medoids Improved Algorithm Based on MapReduce
    Zhao, Yonghan
    Chen, Bin
    Li, Mengyu
    [J]. 2018 SIXTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2018, : 18 - 23
  • [2] Parallel K-Medoids Clustering Algorithm Based on Hadoop
    Jiang, Yaobin
    Zhang, Jiongmin
    [J]. 2014 5TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2014, : 649 - 652
  • [3] K-medoids Clustering Based on MapReduce and Optimal Search of Medoids
    Zhu, Ying-ting
    Wang, Fu-zhang
    Shan, Xing-hua
    Lv, Xiao-yan
    [J]. 2014 PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2014), 2014, : 573 - 577
  • [4] A genetic k-medoids clustering algorithm
    Weiguo Sheng
    Xiaohui Liu
    [J]. Journal of Heuristics, 2006, 12 : 447 - 466
  • [5] An improved k-medoids clustering algorithm
    Cao, Danyang
    Yang, Bingru
    [J]. 2010 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING (ICCAE 2010), VOL 3, 2010, : 132 - 135
  • [6] A genetic k-medoids clustering algorithm
    Sheng, Weiguo
    Liu, Xiaohui
    [J]. JOURNAL OF HEURISTICS, 2006, 12 (06) : 447 - 466
  • [7] An Efficient Density based Improved K-Medoids Clustering algorithm
    Pratap, Raghuvira A.
    Vani, K. Suvarna
    Devi, J. Rama
    Rao, K. Nageswara
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2011, 2 (06) : 49 - 54
  • [8] A Bisecting K-Medoids clustering Algorithm Based on Cloud Model
    Sun, D.
    Fei, H.
    Li, Q.
    [J]. IFAC PAPERSONLINE, 2018, 51 (11) : 308 - 315
  • [9] A simple and fast algorithm for K-medoids clustering
    Park, Hae-Sang
    Jun, Chi-Hyuck
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 3336 - 3341
  • [10] A K-medoids based Clustering Algorithm for Wireless Sensor Networks
    Wang, Jin
    Wang, Kai
    Niu, Junming
    Liu, Wei
    [J]. 2018 INTERNATIONAL WORKSHOP ON ADVANCED IMAGE TECHNOLOGY (IWAIT), 2018,