A hybrid clustering algorithm based on missing attribute interval estimation for incomplete data

Cited by: 23
Authors
Zhang, Li [1 ]
Bing, Zhaohong [1 ]
Zhang, Liyong [2 ]
Affiliations
[1] Liaoning Univ, Sch Informat, Shenyang 110036, Peoples R China
[2] Dalian Univ Technol, Sch Control Sci & Engn, Dalian 116024, Peoples R China
Keywords
Incomplete data set; Intervals reconstruction; Particle swarm; Fuzzy c-means; Clustering; Particle swarm optimization; Information; Imputation; Models
DOI
10.1007/s10044-014-0376-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Partially missing data sets are a prevalent problem in cluster analysis. We propose a hybrid algorithm that combines fuzzy clustering with particle swarm optimization (PSO) for clustering incomplete data, in which missing attributes are represented as intervals. We further develop a neighbor interval reconstruction (NIR) method, based on pre-classification results, that estimates the nearest-neighbor interval of each missing attribute using the nearest-neighbor rule. This prevents interval endpoints from being determined by information from different classes, which improves the accuracy of the missing attribute intervals and makes missing attribute imputation more robust. The hybrid of PSO and fuzzy c-means is then used to cluster the interval-valued data set; the global optimization ability of PSO can improve clustering accuracy compared with gradient-based optimization methods. Experimental results on several UCI data sets show the superiority of the proposed NIR hybrid algorithm.
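The abstract does not spell out how the intervals are built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes missing values are marked as `None`, that pre-classification labels are already available for every sample, that neighbors are ranked by a partial distance over commonly observed attributes, and that the interval is the minimum and maximum of the attribute over the `q` nearest same-class neighbors. The names `partial_distance`, `neighbor_intervals`, and `q` are hypothetical.

```python
import math


def partial_distance(a, b):
    """Euclidean distance over attributes observed in both samples,
    rescaled to the full dimension (assumed distance, not from the source)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def neighbor_intervals(data, labels, q=2):
    """Turn every value into a (low, high) interval.

    Observed values become degenerate intervals (v, v). A missing value
    (None) gets the min/max of that attribute over the q nearest samples
    that share the same pre-classification label (assumption).
    """
    result = []
    for i, sample in enumerate(data):
        row = []
        for j, value in enumerate(sample):
            if value is not None:
                row.append((value, value))
                continue
            # Candidate neighbors: same label, attribute j observed.
            candidates = [
                (partial_distance(sample, other), other[j])
                for k, other in enumerate(data)
                if k != i and labels[k] == labels[i] and other[j] is not None
            ]
            if not candidates:
                raise ValueError(f"no same-class neighbor observes attribute {j}")
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:q]]
            row.append((min(nearest), max(nearest)))
        result.append(row)
    return result


# Toy data: sample 1 is missing its second attribute.
data = [[1.0, 2.0], [1.1, None], [0.9, 2.2], [5.0, 9.0], [5.2, 9.5]]
labels = [0, 0, 0, 1, 1]  # assumed pre-classification result
intervals = neighbor_intervals(data, labels, q=2)
print(intervals[1])  # [(1.1, 1.1), (2.0, 2.2)]
```

Restricting candidates to one label is what keeps the endpoints from being set by samples of another class; without that filter, a value such as 9.0 could widen the interval whenever `q` exceeds the number of nearby same-class samples. The resulting interval-valued data would then be passed to the PSO and fuzzy c-means stage, which is not sketched here.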
Pages: 377-384
Number of pages: 8
Related papers
50 in total
  • [1] A hybrid clustering algorithm based on missing attribute interval estimation for incomplete data
    Zhang, Li
    Bing, Zhaohong
    Zhang, Liyong
    Pattern Analysis and Applications, 2015, 18 : 377 - 384
  • [2] Fuzzy Clustering of Incomplete Data Based on Missing Attribute Interval Size
    Zhang, Li
    Li, Baoxing
    Zhang, Liyong
    Li, Dawei
    2015 IEEE 9TH INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY, AND IDENTIFICATION (ASID), 2015, : 101 - 104
  • [3] A Global Clustering Approach Using Hybrid Optimization for Incomplete Data Based on Interval Reconstruction of Missing Value
    Zhang, Liyong
    Lu, Wei
    Liu, Xiaodong
    Pedrycz, Witold
    Zhong, Chongquan
    Wang, Lu
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2016, 31 (04) : 297 - 313
  • [4] A DATA STREAMS CLUSTERING ALGORITHM BASED ON INTERVAL DATA
    Li, Yan
    Ye, Ming
    Wang, Huiwen
    Liu, Dan
    Che, Yin
    PROCEEDINGS OF THE 38TH INTERNATIONAL CONFERENCE ON COMPUTERS AND INDUSTRIAL ENGINEERING, VOLS 1-3, 2008, : 2775 - 2778
  • [5] A Study of FCM Clustering Algorithm based on Interval Multiple Attribute Information
    Guo, Li
    Liu, Guofeng
    Bao, Yu'e
    ADVANCES IN COMPUTATIONAL MODELING AND SIMULATION, PTS 1 AND 2, 2014, 444-445 : 676 - 680
  • [6] Incremental Attribute Reduction Algorithm Based on Incomplete Hybrid Order Information System
    Chen B.
    Chen L.
    Deng M.
    Chen J.
    Gongcheng Kexue Yu Jishu/Advanced Engineering Sciences, 2024, 56 (01): : 65 - 81
  • [7] Power Incomplete Data Clustering Based on Fuzzy Fusion Algorithm
    Hong Y.
    Yan Y.
    Energy Engineering: Journal of the Association of Energy Engineering, 2023, 120 (01): : 245 - 261
  • [8] ESTIMATION OF MISSING VALUES FOR THE ANALYSIS OF INCOMPLETE DATA
    WILKINSON, GN
    BIOMETRICS, 1958, 14 (02) : 257 - 286
  • [9] Attribute selection approaches for incomplete interval-value data
    Li, Zhaowen
    Liao, Shimin
    Qu, Liangdong
    Song, Yan
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (05) : 8775 - 8792
  • [10] Estimation of Missing Values in Incomplete Industrial Process Data Sets Using ECM Algorithm
    Pirehgalin, Mina Fahimi
    Vogel-Heuser, Birgit
    2018 IEEE 16TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2018, : 245 - 251