Rough subspace-based clustering ensemble for categorical data

Cited by: 19
Authors
Gao, Can [1, 2]
Pedrycz, Witold [2, 3]
Miao, Duoqian [1 ]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
[2] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2G7, Canada
[3] Polish Acad Sci, Syst Res Inst, PL-01447 Warsaw, Poland
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Categorical data; Rough sets; Fuzzy k-modes; Clustering ensemble; Cluster cardinality index; CLASS DISCOVERY; CONSENSUS; FRAMEWORK; MODEL;
DOI
10.1007/s00500-012-0972-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering categorical data is an important problem in data mining that has recently attracted much attention. This paper first studies unsupervised dimensionality reduction for categorical data: based on rough set theory, the attributes of a categorical data set are decomposed into a number of rough subspaces. A novel clustering ensemble algorithm based on these rough subspaces is then proposed. The algorithm clusters the data in a selection of high-quality rough subspaces and combines the resulting partitions into a robust and stable solution. A cluster index is also introduced to evaluate clustering solutions for categorical data. Experimental results on selected UCI data sets show that the proposed method produces better results than other methods when evaluated in terms of cluster validity indexes.
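The general idea of a subspace clustering ensemble can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes hand-picked attribute subsets in place of rough subspaces, hard k-modes in place of fuzzy k-modes, and a simple majority-vote co-association step as the consensus function. The toy data and all function names (`k_modes`, `project`, `co_votes`, `consensus`) are hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(rows, k, max_iter=20):
    """Hard k-modes with deterministic farthest-first seeding (an assumption)."""
    modes = [rows[0]]
    while len(modes) < k:
        modes.append(max(rows, key=lambda r: min(hamming(r, m) for m in modes)))
    labels = None
    for _ in range(max_iter):
        # Assign each object to its nearest mode (ties go to the lower index).
        new_labels = [min(range(k), key=lambda j: hamming(r, modes[j])) for r in rows]
        if new_labels == labels:
            break
        labels = new_labels
        # Update each mode to the per-attribute most frequent value.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                modes[j] = tuple(Counter(col).most_common(1)[0][0]
                                 for col in zip(*members))
    return labels

def project(data, attrs):
    """Restrict every object to the given attribute indices (one subspace)."""
    return [tuple(row[a] for a in attrs) for row in data]

def co_votes(partitions, n):
    """Count, for each pair of objects, how many partitions co-cluster them."""
    votes = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    votes[i][j] += 1
    return votes

def consensus(votes, m):
    """Connected components over pairs co-clustered in a strict majority of m partitions."""
    n = len(votes)
    labels, current = [-1] * n, 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and votes[i][j] * 2 > m:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy categorical data: (colour, shape, size, taste); values are made up.
data = [
    ("red", "round", "small", "sweet"),
    ("red", "round", "small", "sour"),
    ("red", "round", "large", "sweet"),
    ("green", "long", "large", "bitter"),
    ("green", "long", "small", "bitter"),
    ("green", "long", "large", "sour"),
]
subspaces = [(0, 1), (1, 2), (0, 3)]  # hand-picked attribute subsets (assumption)

partitions = [k_modes(project(data, s), k=2) for s in subspaces]
final = consensus(co_votes(partitions, len(data)), len(partitions))
print(final)  # [0, 0, 0, 1, 1, 1]
```

In this toy run the second subspace misplaces one object, and the majority vote over all three partitions corrects it, which is the kind of robustness an ensemble is meant to provide.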
Pages: 1643 - 1658
Page count: 16
Related papers
50 in total
  • [1] Rough subspace-based clustering ensemble for categorical data
    Can Gao
    Witold Pedrycz
    Duoqian Miao
    [J]. Soft Computing, 2013, 17 : 1643 - 1658
  • [2] Ensemble based rough fuzzy clustering for categorical data
    Saha, Indrajit
    Sarkar, Jnanendra Prasad
    Maulik, Ujjwal
    [J]. KNOWLEDGE-BASED SYSTEMS, 2015, 77 : 114 - 127
  • [3] Categorical Data Clustering Based on Cluster Ensemble Process
    Veeraiah, D.
    Vasumathi, D.
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, ICICT 2015, VOL 2, 2016, 439 : 101 - 111
  • [4] An entropy-based subspace clustering algorithm for categorical data
    Carbonera, Joel Luis
    Abel, Mara
    [J]. 2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014 : 272 - 277
  • [5] Fuzzy rough clustering for categorical data
    Xu, Shuliang
    Liu, Shenglan
    Zhou, Jian
    Feng, Lin
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (11) : 3213 - 3223
  • [6] Fuzzy rough clustering for categorical data
    Shuliang Xu
    Shenglan Liu
    Jian Zhou
    Lin Feng
    [J]. International Journal of Machine Learning and Cybernetics, 2019, 10 : 3213 - 3223
  • [7] Data Labeling method based on Rough Entropy for Categorical Data Clustering
    Sreenivasulu, G.
    Raju, S. Viswanadha
    Rao, N. Sambasiva
    [J]. 2014 INTERNATIONAL CONFERENCE ON ELECTRONICS, COMMUNICATION AND COMPUTATIONAL ENGINEERING (ICECCE), 2014 : 173 - 178
  • [8] Kernel Subspace Clustering Algorithm for Categorical Data
    Xu, Kun-Peng
    Chen, Li-Fei
    Sun, Hao-Jun
    Wang, Bei-Zhan
    [J]. Ruan Jian Xue Bao/Journal of Software, 2020, 31 (11) : 3492 - 3505
  • [9] A subspace hierarchical clustering algorithm for categorical data
    Carbonera, Joel Luis
    Abel, Mara
    [J]. 2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019 : 509 - 516
  • [10] Parallel Hierarchical Subspace Clustering of Categorical Data
    Pang, Ning
    Zhang, Jifu
    Zhang, Chaowei
    Qin, Xiao
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2019, 68 (04) : 542 - 555