A Valid Clustering Algorithm for High-dimensional Large Data Sets Based on Distributed Method

Authors
Guo Xian e [1]
Yan Junmei [1]
Affiliations
[1] Math & Comp Sci Inst, Datong, Shanxi, Peoples R China
Keywords
fuzzy clustering; distributed method; genetic algorithm; fuzzy dissimilar matrix; large data sets; high dimension
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data sets are randomly divided into several subsets, and a genetic-algorithm-based fuzzy clustering method for high-dimensional data is proposed to cluster each subset. The method introduces a fuzzy dissimilarity matrix to express the degree of dissimilarity between any two data points and initializes the high-dimensional samples as points on a two-dimensional plane. A genetic algorithm then iteratively optimizes the two-dimensional coordinates so that the Euclidean distance between points in the plane gradually approaches the fuzzy dissimilarity between the corresponding samples. Finally, the two-dimensional data are clustered with the FCM algorithm, which avoids the dependence of clustering validity on the spatial distribution of the high-dimensional samples. Experimental results show that the method produces high-quality clusterings and greatly improves clustering speed.
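The pipeline in the abstract can be illustrated with a minimal sketch for a single subset. It is not the authors' implementation: the abstract does not define the fuzzy dissimilarity measure, the genetic operators, or any parameter values, so everything here is an assumption made for illustration. The dissimilarity is taken to be Euclidean distance scaled to [0, 1], the genetic algorithm is a simple elitist scheme with uniform crossover and Gaussian mutation, and the names `dissimilarity_matrix`, `stress`, `evolve_embedding`, and `fcm` are hypothetical.

```python
import math
import random


def dissimilarity_matrix(samples):
    """Pairwise dissimilarity in [0, 1] (assumed: Euclidean distance / max distance)."""
    n = len(samples)
    d = [[math.dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in d) or 1.0  # avoid dividing by zero
    return [[v / d_max for v in row] for row in d]


def stress(coords, dissim):
    """Sum of squared gaps between 2-D distances and target dissimilarities."""
    n = len(coords)
    return sum(
        (math.dist(coords[i], coords[j]) - dissim[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def evolve_embedding(dissim, pop_size=30, generations=300, sigma=0.05, seed=0):
    """Elitist GA (assumed operators); an individual is a list of n (x, y) points."""
    rng = random.Random(seed)
    n = len(dissim)
    population = [
        [(rng.random(), rng.random()) for _ in range(n)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda ind: stress(ind, dissim))
        parents = population[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover per point, then Gaussian mutation of both coordinates.
            child = [
                (p[0] + rng.gauss(0, sigma), p[1] + rng.gauss(0, sigma))
                for p in (rng.choice(pair) for pair in zip(a, b))
            ]
            children.append(child)
        population = parents + children
    return min(population, key=lambda ind: stress(ind, dissim))


def fcm(points, c=2, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means on 2-D points; returns (memberships, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, c)
    u = []
    for _ in range(iters):
        u = []
        for p in points:
            dists = [max(math.dist(p, v), 1e-12) for v in centers]
            u.append(
                [1.0 / sum((dk / dj) ** (2 / (m - 1)) for dj in dists) for dk in dists]
            )
        centers = [
            tuple(
                sum(u[i][k] ** m * points[i][axis] for i in range(len(points)))
                / sum(u[i][k] ** m for i in range(len(points)))
                for axis in range(2)
            )
            for k in range(c)
        ]
    return u, centers
```

In this sketch, a subset would be processed as `fcm(evolve_embedding(dissimilarity_matrix(samples)))`. Because the fitter half of the population is carried over unchanged each generation, the best stress value never increases; how closely the final layout matches the dissimilarities depends on the assumed population size, mutation scale, and generation count.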
Pages: 1 / 6
Page count: 6