A Valid Clustering Algorithm for High-dimensional Large Data Sets Based on Distributed Method

Authors
Guo Xian e [1]
Yan Junmei [1]
Affiliations
[1] Math & Comp Sci Inst, Datong, Shanxi, Peoples R China
Keywords
fuzzy clustering; distributed method; genetic algorithm; fuzzy dissimilar matrix; large data sets; high dimension
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The data set is randomly divided into several subsets, and a genetic-algorithm-based fuzzy clustering method for high-dimensional data is proposed to cluster each subset. The method introduces a fuzzy dissimilar matrix to express the degree of dissimilarity between any two data points and initializes the high-dimensional samples on a two-dimensional plane. A genetic algorithm then iteratively optimizes the two-dimensional coordinates so that the Euclidean distance between points on the plane gradually approximates the fuzzy dissimilarity between the corresponding samples. Finally, the two-dimensional data are clustered with the FCM algorithm, which avoids the dependence of clustering validity on the spatial distribution of the high-dimensional samples. Experimental results show that the method produces high-quality clusterings and greatly improves clustering speed.
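The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the definition of the fuzzy dissimilar matrix, the genetic operators, or any parameter values, so the max-scaled Euclidean dissimilarity, the elitist GA with uniform crossover and Gaussian mutation, and all names and defaults below (`fuzzy_dissimilarity`, `ga_embed`, `fcm`, `pop_size`, `sigma`, and so on) are assumptions made for the example. How results from different subsets are merged is also not stated in the abstract and is left out.

```python
import numpy as np

def fuzzy_dissimilarity(X):
    """Pairwise dissimilarity in [0, 1] (assumed form: Euclidean distance / max distance)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    return D / D.max() if D.max() > 0 else D

def stress(Y, D):
    """Squared mismatch between 2-D Euclidean distances and the target dissimilarities."""
    diff = Y[:, None, :] - Y[None, :, :]
    d2 = np.sqrt((diff ** 2).sum(axis=-1))
    return ((d2 - D) ** 2).sum() / 2.0

def ga_embed(D, pop_size=30, generations=200, sigma=0.05, rng=None):
    """Elitist GA (assumed operators) that searches for 2-D coordinates with low stress."""
    rng = np.random.default_rng(rng)
    n = D.shape[0]
    pop = rng.random((pop_size, n, 2))           # random initial 2-D layouts
    for _ in range(generations):
        fitness = np.array([stress(ind, D) for ind in pop])
        elite = pop[np.argsort(fitness)[: pop_size // 2]]   # keep the better half unchanged
        k = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=k)]
        pb = elite[rng.integers(len(elite), size=k)]
        mask = rng.random((k, n, 1)) < 0.5       # uniform crossover, point by point
        children = np.where(mask, pa, pb)
        children = children + rng.normal(0.0, sigma, children.shape)  # Gaussian mutation
        pop = np.concatenate([elite, children])
    fitness = np.array([stress(ind, D) for ind in pop])
    return pop[np.argmin(fitness)]

def fcm(Y, c, m=2.0, iters=100, rng=None):
    """Standard fuzzy c-means on the 2-D points; returns memberships U and cluster centers."""
    rng = np.random.default_rng(rng)
    U = rng.random((Y.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ Y) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

# Toy usage: two synthetic 10-dimensional groups, split at random into two subsets.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 10)), rng.normal(1, 0.1, (20, 10))])
for idx in np.array_split(rng.permutation(len(X)), 2):
    D = fuzzy_dissimilarity(X[idx])
    Y = ga_embed(D, rng=0)                       # 2-D coordinates for this subset
    U, centers = fcm(Y, c=2, rng=0)
    labels = U.argmax(axis=1)                    # hard labels from fuzzy memberships
```

Because each subset is embedded and clustered independently, the loop over subsets could be run on separate workers, which is presumably where the distributed speed-up comes from.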
Pages: 1 / 6
Page count: 6