A hybrid clustering algorithm based on missing attribute interval estimation for incomplete data

Cited by: 23
Authors
Zhang, Li [1 ]
Bing, Zhaohong [1 ]
Zhang, Liyong [2 ]
Affiliations
[1] Liaoning Univ, Sch Informat, Shenyang 110036, Peoples R China
[2] Dalian Univ Technol, Sch Control Sci & Engn, Dalian 116024, Peoples R China
Keywords
Incomplete data set; Intervals reconstruction; Particle swarm; Fuzzy c-means; Clustering; Particle swarm optimization; Information; Imputation; Models
DOI
10.1007/s10044-014-0376-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Partially missing data are a pervasive problem in cluster analysis. We propose a hybrid algorithm that combines fuzzy clustering with particle swarm optimization (PSO) for clustering incomplete data, in which missing attributes are represented as intervals. We further develop a neighbor interval reconstruction (NIR) method, based on pre-classification results, that estimates the nearest-neighbor interval of each missing attribute using the nearest-neighbor rule. This prevents interval endpoints from being determined by information from different classes, which improves the accuracy of the missing-attribute intervals and makes the imputation of missing attributes more robust. The hybrid of PSO and fuzzy c-means is then used to cluster the interval-valued data set; the global optimization ability of PSO can improve clustering accuracy compared with gradient-based optimization methods. Experimental results on several UCI data sets show the superiority of the proposed NIR hybrid algorithm.
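The interval-estimation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' NIR implementation: the partial-distance measure, the neighbor count `q`, the optional `labels` argument (a stand-in for the pre-classification result), and both function names are assumptions introduced here for illustration. Each missing attribute is replaced by the [min, max] range of that attribute over the nearest neighbors that observe it; observed attributes become degenerate intervals. The PSO and fuzzy c-means clustering stage is not shown.

```python
import numpy as np

def partial_distance(a, b):
    """Squared distance over attributes observed in both vectors,
    rescaled to the full dimension (an assumed distance choice)."""
    mask = ~np.isnan(a) & ~np.isnan(b)
    if not mask.any():
        return np.inf  # no shared observed attribute: not comparable
    diff = a[mask] - b[mask]
    return float(a.size / mask.sum() * np.dot(diff, diff))

def nearest_neighbor_intervals(X, q=3, labels=None):
    """Return (lower, upper) arrays of interval endpoints.

    X      : 2-D array with np.nan marking missing attributes.
    q      : number of nearest neighbors per missing attribute (assumed).
    labels : optional pre-classification labels (hypothetical input);
             if given, neighbors are restricted to the same label.
    """
    X = np.asarray(X, dtype=float)
    lower, upper = X.copy(), X.copy()
    n = X.shape[0]
    for i in range(n):
        missing = np.flatnonzero(np.isnan(X[i]))
        if missing.size == 0:
            continue
        dists = np.full(n, np.inf)
        for k in range(n):
            if k == i:
                continue
            if labels is not None and labels[k] != labels[i]:
                continue  # skip samples from a different pre-class
            dists[k] = partial_distance(X[i], X[k])
        order = np.argsort(dists)
        for j in missing:
            # q nearest comparable neighbors that observe attribute j
            cand = [k for k in order
                    if np.isfinite(dists[k]) and not np.isnan(X[k, j])][:q]
            if cand:  # otherwise the endpoints stay NaN
                vals = X[cand, j]
                lower[i, j], upper[i, j] = vals.min(), vals.max()
    return lower, upper

X = [[1.0, 2.0], [1.1, np.nan], [1.2, 2.2], [5.0, 9.0]]
lower, upper = nearest_neighbor_intervals(X, q=2)
print(lower[1, 1], upper[1, 1])
```

Restricting neighbors to the same label is one plausible reading of how pre-classification keeps interval endpoints from being set by samples of another class; the paper itself should be consulted for the actual rule.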
Pages: 377-384
Number of pages: 8
Related papers
50 in total
  • [21] Possibility Clustering Algorithm for Incomplete Data Based on a Deep Computing Model
    Li, Dongping
    Yang, Yingchun
    Yue, Qiang
    Cheng, Liqi
    Song, Jie
    Liu, Yuyan
    JOURNAL OF INTERCONNECTION NETWORKS, 2022, 22 (SUPP03)
  • [22] Fuzzy C-means clustering algorithm based on incomplete data
    Jia, Zhiping
    Yu, Zhiqiang
    Zhang, Chenghui
    2006 IEEE INTERNATIONAL CONFERENCE ON INFORMATION ACQUISITION, VOLS 1 AND 2, CONFERENCE PROCEEDINGS, 2006, : 600 - 604
  • [23] A hybrid data clustering algorithm based on improved krill herd algorithm and KHM clustering
    Wang, Qiu-Ping
    Ding, Cheng
    Wang, Xiao-Feng
    Kongzhi yu Juece/Control and Decision, 2020, 35 (10): : 2449 - 2458
  • [24] Clustering-Based Hybrid Approach for Multivariate Missing Data Imputation
    Dubey, Aditya
    Rasool, Akhtar
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (11) : 710 - 714
  • [25] Hybrid Algorithm to Data Clustering
    Gil, Miguel
    Ochoa, Alberto
    Zamarron, Antonio
    Carpio, Juan
    HYBRID ARTIFICIAL INTELLIGENCE SYSTEMS, 2009, 5572 : 678 - +
  • [26] An attribute reduction algorithm in the incomplete information system based on the attribute significance
    Zhen, Chen
    Xue, Xing Xiao
    PROCEEDINGS OF 2014 IEEE WORKSHOP ON ADVANCED RESEARCH AND TECHNOLOGY IN INDUSTRY APPLICATIONS (WARTIA), 2014, : 1405 - 1407
  • [27] Attribute reduction algorithm for incomplete decision table based on attribute discernibility
    Ji, X. (jixia1983@163.com), 1600, South China University of Technology (41):
  • [28] Missing data analyses: a hybrid multiple imputation algorithm using Gray System Theory and entropy based on clustering
    Jing Tian
    Bing Yu
    Dan Yu
    Shilong Ma
    Applied Intelligence, 2014, 40 : 376 - 388
  • [29] Missing data analyses: a hybrid multiple imputation algorithm using Gray System Theory and entropy based on clustering
    Tian, Jing
    Yu, Bing
    Yu, Dan
    Ma, Shilong
    APPLIED INTELLIGENCE, 2014, 40 (02) : 376 - 388
  • [30] A Grey System Based Missing Sensor Data Estimation Algorithm
    Liu, Feng
    You, Ziqi
    Shan, Wenze
    Liu, Jianxiao
    PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 482 - 486