Research on SCKM Algorithm Based on the Parallel Clustering

Cited by: 0
Authors
Zhang, Min [1]
Zang, ZhaoJie [1]
Niu, YuJun [1]
Shi, Longxiang [2]
Affiliations
[1] Dalian Univ, Coll Informat & Engn, Dalian 116622, Peoples R China
[2] Tamkang Univ, Coll Sci, New Taipei 25137, Taiwan
Keywords
Spark; cluster; parallel
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Serial processing cannot meet the high-performance requirements of extracting useful information quickly from massive data. Distributed computing and parallel computing, by contrast, are both good choices for processing high-volume data with high performance. Spark, an in-memory parallel computing framework for large data, has attracted great attention since it was proposed. The k-medoids algorithm is sensitive to the choice of center points and requires a large number of iterations to compute new center points. To address these limitations, this paper puts forward a parallel algorithm named Canopy-Kmedoids on the Spark platform. The algorithm obtains the K center points with the Canopy algorithm. Its design draws on an analysis of the performance of Spark's parallel operators and on Spark's good performance for iterative computation. In addition, the method reduces the frequency of shuffle and disk reads and writes, which effectively overcomes the shortcomings of k-medoids. The experimental results show that the parallel algorithm achieves a relatively ideal speedup ratio and can handle massive data efficiently.
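The abstract does not give the algorithm's details, so the following is only a minimal serial sketch of the general idea: Canopy picks the starting centers, and k-medoids then refines them. It is written in plain Python rather than on Spark, and it is not the authors' implementation. The function names, the thresholds `t1` and `t2`, the rule of taking the centers of the K largest canopies, and the sample points are all assumptions made for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length point tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def canopy(points, t1, t2):
    """Build canopies with a loose threshold t1 and a tight threshold t2 (t1 > t2 > 0)."""
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining[0]
        members = [p for p in remaining if dist(center, p) < t1]
        canopies.append((center, members))
        # Points inside the tight threshold can no longer seed a new canopy.
        remaining = [p for p in remaining if dist(center, p) >= t2]
    return canopies

def assign(points, medoids):
    """Group every point under its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for p in points:
        clusters[min(medoids, key=lambda m: dist(p, m))].append(p)
    return clusters

def k_medoids(points, medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        clusters = assign(points, medoids)
        # New medoid of a cluster: the member with the smallest total distance to the others.
        updated = [
            min(members, key=lambda c: sum(dist(c, q) for q in members))
            for members in clusters.values()
        ]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return assign(points, medoids)

def canopy_kmedoids(points, k, t1, t2):
    """Seed k-medoids with the centers of the k largest canopies (an assumed selection rule)."""
    canopies = sorted(canopy(points, t1, t2), key=lambda c: len(c[1]), reverse=True)
    if len(canopies) < k:
        raise ValueError("fewer canopies than k; try smaller thresholds")
    return k_medoids(points, [center for center, _ in canopies[:k]])

# Hypothetical sample data: two well-separated groups of four points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]
clusters = canopy_kmedoids(points, k=2, t1=3.0, t2=2.0)
for medoid, members in clusters.items():
    print(medoid, members)
```

In a Spark version, the per-point assignment step is the natural candidate for a distributed map over partitions, with the small list of medoids shared across workers; how the paper actually arranges its operators is not stated in the abstract.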
Pages: 4
Related papers
50 in total
  • [1] Research on HCKM Algorithm Based on Parallel Clustering
    Zhang, Min
    Zang, Zhao-jie
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTER SCIENCE (AICS 2016), 2016, : 178 - 182
  • [2] Research of parallel DBSCAN clustering algorithm based on MapReduce
    [J]. Fu, X. (xffu@gdut.edu.cn), 1600, Science and Engineering Research Support Society (07):
  • [3] Research on The parallel Text Clustering Algorithm Based on the Semantic Tree
    Liu, Gangfeng
    Wang, Yunlan
    Zhao, Tianhai
    Li, Dongyang
    [J]. 2011 6TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCES AND CONVERGENCE INFORMATION TECHNOLOGY (ICCIT), 2012, : 400 - 403
  • [4] Research of text clustering based on hybrid Parallel Genetic Algorithm
    Dai, Wenhua
    Rao, Guizhen
    He, Tingting
    [J]. PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, PROCEEDINGS, 2007, : 28 - 31
  • [5] Research on Text Feature Clustering Based on Improved Parallel Genetic Algorithm
    Jiang, Mingyang
    Fan, Xiaojing
    Pei, Zhili
    Zhang, Zhifeng
    [J]. PROCEEDINGS OF 2018 TENTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2018, : 235 - 238
  • [6] Research on Parallel Data Stream Clustering Algorithm based on Grid and Density
    Hu, Weihua
    Cheng, Mingzhong
    Wu, Guoping
    Wu, Liang
    [J]. 2015 International Conference on Computer Science and Mechanical Automation (CSMA), 2015, : 70 - 75
  • [7] Research of K-means clustering method based on parallel genetic algorithm
    Dai, Wenhua
    Jiao, Cuizhen
    He, Tingting
    [J]. 2007 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING, VOL II, PROCEEDINGS, 2007, : 158 - +
  • [8] Clustering Algorithm Research Based on SOM
    Chen Xuimin
    Zou Kaiqi
    Chen Xiumin
    Fu ChangQing
    [J]. ICCSE 2008: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION: ADVANCED COMPUTER TECHNOLOGY, NEW EDUCATION, 2008, : 27 - 31
  • [9] An efficient parallel direction-based clustering algorithm
    Zhong, Kai
    Zhou, Xu
    Zhou, Liqian
    Yang, Zhibang
    Liu, Chubo
    Xiao, Na
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2020, 145 : 24 - 33
  • [10] Parallel Diffrential Evolution Clustering Algorithm based on MapReduce
    Daoudi, Meroua
    Hamena, Soumiya
    Benmounah, Zakaria
    Batouche, Mohamed
    [J]. 2014 6TH INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2014, : 337 - 341