RETRACTED: An Ensemble Clustering Approach (Consensus Clustering) for High-Dimensional Data (Retracted Article)

Cited by: 5
Authors
Yan, Jingdong [1]
Liu, Wuwei [1]
Affiliations
[1] Wuhan Univ Technol, Sch Management, Wuhan 430070, Hubei, Peoples R China
Keywords
FLOW;
DOI
10.1155/2022/5629710
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
High-dimensional data contain many irrelevant attributes, are sparsely distributed, and are expensive to process, so traditional clustering algorithms such as K-means perform poorly on them. To address this problem, this paper studies an ensemble clustering method for high-dimensional data. A subspace division method based on minimum redundancy is proposed, in which subspace division is improved by using the K-means algorithm: the mutual information between the feature variables of the data replaces the distance calculation in K-means, and the data are divided into subspaces according to the mutual information values between feature variables. To achieve high clustering accuracy and diversity, a genetic algorithm is used as the consensus function. The fitness function is designed according to the clustering fusion target, and the selection operator is designed according to the maximum number of overlapping elements in the base clusterings. The proposed algorithm is compared with other commonly used clustering fusion algorithms; the experimental results show that it outperforms them on most datasets and is an effective ensemble clustering algorithm.
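The abstract names mutual information between feature variables as the quantity that replaces distance when features are grouped into subspaces, but it does not give the formula or the grouping rule. The sketch below only illustrates the standard mutual information between two discrete feature columns and a pairwise matrix built from it. The function names `mutual_information` and `pairwise_mi`, the toy data, and the assumption that features are already discretized are illustrative choices, not taken from the paper.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Mutual information (in nats) between two equal-length discrete sequences.

    Assumption: feature values are already discrete (e.g. binned).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), expressed with raw counts
        mi += (c / n) * log(c * n / (count_x[a] * count_y[b]))
    return mi


def pairwise_mi(columns):
    """Symmetric matrix of mutual information between every pair of feature columns."""
    d = len(columns)
    return [
        [mutual_information(columns[i], columns[j]) for j in range(d)]
        for i in range(d)
    ]


# Hypothetical toy features: f1 duplicates f0 (fully redundant),
# f2 is statistically independent of f0.
f0 = [0, 0, 1, 1]
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
mi_matrix = pairwise_mi([f0, f1, f2])
```

In this toy case the redundant pair (f0, f1) has mutual information log 2 and the independent pair (f0, f2) has mutual information 0. A matrix like `mi_matrix` could serve as the similarity input to a feature-grouping step, though how the paper actually uses these values inside K-means is not specified in the abstract.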
Pages: 9
Related papers
  • [1] Ensemble Clustering for Boundary Detection in High-Dimensional Data
    Anagnostou, Panagiotis
    Pavlidis, Nicos G.
    Tasoulis, Sotiris
    [J]. MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE, LOD 2023, PT II, 2024, 14506 : 324 - 333
  • [2] RETRACTED: Environmental data analysis based on fuzzy clustering method (Retracted Article)
    Li, Yongyi
    Yang, Zhongqiang
    Han, Kaixu
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL ENGINEERING EDUCATION, 2020,
  • [3] RETRACTED: RFID Data Analysis and Evaluation Based on Big Data and Data Clustering (Retracted Article)
    Lv, Lihua
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [4] Subspace-Weighted Consensus Clustering for High-Dimensional Data
    Cai, Xiaosha
    Huang, Dong
    [J]. ADVANCED DATA MINING AND APPLICATIONS, 2020, 12447 : 3 - 16
  • [5] Clustering High-Dimensional Data via Random Sampling and Consensus
    Traganitis, Panagiotis A.
    Slavakis, Konstantinos
    Giannakis, Georgios B.
    [J]. 2014 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2014, : 307 - 311
  • [6] High-dimensional data clustering
    Bouveyron, C.
    Girard, S.
    Schmid, C.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2007, 52 (01) : 502 - 519
  • [7] Clustering High-Dimensional Data
    Masulli, Francesco
    Rovetta, Stefano
    [J]. CLUSTERING HIGH-DIMENSIONAL DATA, CHDD 2012, 2015, 7627 : 1 - 13
  • [8] RETRACTED: Evaluation of English Proficiency Based on Big Data Clustering Algorithm (Retracted Article)
    Duan, Li
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [9] Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
    Vijendra, Singh
    Laxman, Sahoo
    [J]. APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, 2013, 2013
  • [10] Subspace clustering of high-dimensional data: a predictive approach
    Brian McWilliams
    Giovanni Montana
    [J]. Data Mining and Knowledge Discovery, 2014, 28 : 736 - 772