Parallel K-means clustering algorithm on DNA dataset

Authors
Othman, F [1]
Abdullah, R [1]
Rashid, NA [1]
Salam, RA [1]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town 11800, Malaysia
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Clustering divides data into groups of similar objects. K-means is widely used for clustering because the algorithm is simple. Our main effort is to parallelize the K-means clustering algorithm. The parallel version exploits the inherent parallelism in the Distance Calculation and Centroid Update phases. It is designed so that each of the P participating nodes is responsible for n/P data points. We ran the program on a Linux cluster with a maximum of eight nodes using the message-passing programming model. We evaluated performance in terms of the percentage of correct answers and the speed-up achieved. The results show that our parallel K-means program performs relatively well on large datasets.
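The partitioning scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' program: it assumes the data points are already encoded as numeric vectors (the abstract does not say how the DNA data is represented), it simulates the P nodes in a single process, and the function names assign_and_accumulate and parallel_kmeans are hypothetical. In a real message-passing setting, the step that combines the partial sums would likely be a reduction across nodes.

```python
def assign_and_accumulate(chunk, centroids):
    """Distance Calculation phase on one node: assign each local point to its
    nearest centroid and return per-cluster partial sums and counts."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in chunk:
        # Squared Euclidean distance; ties go to the lowest cluster index.
        nearest = min(
            range(k),
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def parallel_kmeans(points, centroids, num_nodes, max_iters=100):
    """Simulated parallel K-means: each of num_nodes nodes handles about
    n / num_nodes points. Nodes are simulated sequentially (an assumption)."""
    n = len(points)
    k, dim = len(centroids), len(centroids[0])
    # Split the n points into num_nodes contiguous chunks of roughly equal size.
    chunks = [
        points[i * n // num_nodes:(i + 1) * n // num_nodes]
        for i in range(num_nodes)
    ]
    centroids = [list(map(float, c)) for c in centroids]
    for _ in range(max_iters):
        # Each node works only on its own chunk.
        partials = [assign_and_accumulate(chunk, centroids) for chunk in chunks]
        # Centroid Update phase: combine the partial results from all nodes.
        # This loop stands in for a reduction over the message-passing layer.
        new_centroids = []
        for c in range(k):
            total = sum(counts[c] for _, counts in partials)
            if total == 0:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
            else:
                new_centroids.append(
                    [sum(sums[c][d] for sums, _ in partials) / total
                     for d in range(dim)]
                )
        if new_centroids == centroids:
            break  # assignments are stable, so the centroids stopped moving
        centroids = new_centroids
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    print(parallel_kmeans(data, [[0, 0], [10, 10]], num_nodes=4))
```

Only the per-cluster sums and counts cross node boundaries in this sketch, so the volume of data exchanged per iteration depends on the number of clusters and dimensions, not on n.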
Pages: 248-251
Page count: 4