A Hierarchical Clustering for Categorical Data Based on Holo-entropy

Cited by: 3
Authors
Sun, Haojun [1]
Chen, Rongbo [1]
Jin, Shulin [1]
Qin, Yong [2]
Affiliations
[1] Shantou Univ, Dept Comp Sci, Shantou, Peoples R China
[2] Beijing Jiaotong Univ, State Key Lab Rail Traff Control & Safety, Beijing, Peoples R China
Keywords
Hierarchical Clustering; Holo-entropy; Subspace; Categorical Data; ALGORITHM;
DOI
10.1109/WISA.2015.18
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
High-dimensional data clustering is a difficult task in cluster analysis, and subspace clustering is an effective approach to it. The principle of subspace clustering is to retain as much of the original data's information as possible while searching for the smallest subspace that can represent each cluster. Based on information entropy and Holo-entropy, we propose an adaptive weighted subspace clustering algorithm for high-dimensional data. The algorithm uses information entropy to extract the feature subspace. Instead of a traditional similarity measure, it merges sub-clusters according to class compactness, which combines Holo-entropy with attribute weights in the subspace, and at each step it selects the two most compact sub-clusters to merge so as to obtain the best clustering result. The algorithm is tested on nine UCI datasets and compared with other algorithms. It outperforms the existing algorithms in both efficiency and accuracy and is highly reproducible.
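The abstract gives no formulas, so the following Python sketch only illustrates the general idea of merging by compactness. It assumes the common definition of holo-entropy as the sum of the per-attribute Shannon entropies, and it treats a merged pair as "most compact" when its weighted holo-entropy is lowest. The function names and the `weights` parameter are hypothetical stand-ins; the paper's actual subspace extraction and weighting scheme are not reproduced here.

```python
from collections import Counter
from itertools import combinations
from math import log2


def attribute_entropy(rows, j):
    """Shannon entropy (in bits) of the j-th categorical attribute over rows."""
    counts = Counter(row[j] for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def holo_entropy(rows, weights=None):
    """Weighted sum of per-attribute entropies.

    With uniform weights this is the plain holo-entropy. The weights
    argument is an assumption standing in for the paper's subspace weights.
    """
    m = len(rows[0])
    weights = weights or [1.0] * m
    return sum(w * attribute_entropy(rows, j) for j, w in enumerate(weights))


def agglomerate(rows, k, weights=None):
    """Greedy bottom-up merging (illustrative, not the authors' procedure).

    Start from singleton clusters and repeatedly merge the pair whose
    union has the lowest weighted holo-entropy, until k clusters remain.
    """
    clusters = [[row] for row in rows]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: holo_entropy(clusters[p[0]] + clusters[p[1]], weights),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return clusters
```

For example, `agglomerate([("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")], 2)` groups the identical rows together, because merging identical rows adds no entropy. This naive version re-evaluates every pair at every step, so it is only meant for small illustrative inputs.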
Pages: 269 - 274
Page count: 6
Related papers
50 in total
  • [1] Holo-Entropy Based Categorical Data Hierarchical Clustering
    Sun, Haojun
    Chen, Rongbo
    Qin, Yong
    Wang, Shengrui
    [J]. INFORMATICA, 2017, 28 (02) : 303 - 328
  • [2] Model-Based Hierarchical Clustering for Categorical Data
    Alalyan, Fahdah
    Zamzami, Nuha
    Bouguila, Nizar
    [J]. 2019 IEEE 28TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2019, : 1424 - 1429
  • [3] Ordering of categorical data in hierarchical clustering
    Kazimianec, Michail
    [J]. DATABASES AND INFORMATION SYSTEMS, 2008, : 401 - 404
  • [4] Data Labeling method based on Rough Entropy for Categorical Data Clustering
    Sreenivasulu, G.
    Raju, S. Viswanadha
    Rao, N. Sambasiva
    [J]. 2014 INTERNATIONAL CONFERENCE ON ELECTRONICS, COMMUNICATION AND COMPUTATIONAL ENGINEERING (ICECCE), 2014, : 173 - 178
  • [5] An entropy-based subspace clustering algorithm for categorical data
    Carbonera, Joel Luis
    Abel, Mara
    [J]. 2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014, : 272 - 277
  • [6] Hierarchical density-based clustering of categorical data and a simplification
    Andreopoulos, Bill
    An, Aijun
    Wang, Xiaogang
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2007, 4426 : 11 - +
  • [7] Generalized Entropy and Projection Clustering of Categorical Data
    Simovici, Dan A.
    Cristofor, Dana
    Cristofor, Laurentiu
    [J]. LECTURE NOTES IN COMPUTER SCIENCE <D>, 2000, 1910 : 619 - 625
  • [8] Hierarchical division clustering framework for categorical data
    Wei, Wei
    Liang, Jiye
    Guo, Xinyao
    Song, Peng
    Sun, Yijun
    [J]. NEUROCOMPUTING, 2019, 341 : 118 - 134
  • [9] A hierarchical clustering algorithm for categorical sequence data
    Oh, SJ
    Kim, JY
    [J]. INFORMATION PROCESSING LETTERS, 2004, 91 (03) : 135 - 140
  • [10] A subspace hierarchical clustering algorithm for categorical data
    Carbonera, Joel Luis
    Abel, Mara
    [J]. 2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 509 - 516