A Multi-View Clustering Algorithm for Mixed Numeric and Categorical Data

Cited by: 6
Authors
Ji, Jinchao [1, 2, 3]
Li, Ruonan [1, 2, 3]
Pang, Wei [4, 5]
He, Fei [1, 2, 3]
Feng, Guozhong [1, 2, 3]
Zhao, Xiaowei [1, 2, 3]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Peoples R China
[2] Northeast Normal Univ, Inst Computat Biol, Changchun 130117, Peoples R China
[3] Northeast Normal Univ, Key Lab Intelligent Informat Proc Jilin Univ, Changchun 130117, Peoples R China
[4] Heriot Watt Univ, Sch Math & Comp Sci, Edinburgh EH14 4AS, Midlothian, Scotland
[5] Xian Univ Technol, Shaanxi Key Lab Complex Syst Control & Intelligen, Xian 710048, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Data clustering; multi-view learning; mixed data; numeric and categorical attributes;
DOI
10.1109/ACCESS.2021.3057113
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Clustering data with both numeric and categorical attributes is of great importance, as such data are ubiquitous in real-world problems. Multi-view learning approaches have proven to be more effective, and to generalise better, than single-view learning in many problems. However, most existing clustering algorithms developed for mixed numeric and categorical data are single-view. In this research, we propose a novel multi-view clustering algorithm for mixed data based on the k-prototypes algorithm, which we term Multi-view K-Prototypes. To the best of our knowledge, the proposed Multi-view K-Prototypes is the first multi-view version of the well-known k-prototypes algorithm. To cluster mixed data over multiple views, we present a novel prototype representation of cluster centres for the multi-view setting, and we devise formulas for updating the cluster centres over each view. We then propose the concept of consensus cluster centres to produce the final clustering result. Finally, we carried out a series of experiments on four benchmark datasets to assess the performance of the proposed Multi-view K-Prototypes algorithm. Experimental results show that Multi-view K-Prototypes outperforms seven state-of-the-art algorithms in most cases.
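For orientation, the sketch below illustrates the classic single-view k-prototypes building blocks that the abstract refers to: a mixed dissimilarity (squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches) and a prototype update (mean for numeric attributes, mode for categorical ones). It is a minimal illustration only, not the multi-view update formulas or consensus centres of the paper, which the abstract does not spell out. The function names and the `gamma` weight parameter are assumptions chosen for this example.

```python
from collections import Counter


def mixed_distance(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Classic k-prototypes dissimilarity (illustrative, single view).

    Squared Euclidean distance on the numeric attributes plus gamma times
    the number of categorical attributes that differ from the prototype.
    """
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric + gamma * categorical


def update_prototype(members_num, members_cat):
    """Recompute one cluster prototype from its member objects.

    Numeric part: per-attribute mean. Categorical part: per-attribute mode.
    """
    n = len(members_num)
    proto_num = [sum(col) / n for col in zip(*members_num)]
    proto_cat = [Counter(col).most_common(1)[0][0] for col in zip(*members_cat)]
    return proto_num, proto_cat


def assign(points_num, points_cat, protos, gamma=1.0):
    """Return, for each object, the index of its nearest prototype.

    `protos` is a list of (numeric_part, categorical_part) pairs.
    """
    labels = []
    for xn, xc in zip(points_num, points_cat):
        best = min(
            range(len(protos)),
            key=lambda k: mixed_distance(xn, xc, protos[k][0], protos[k][1], gamma),
        )
        labels.append(best)
    return labels
```

A full clustering loop would alternate `assign` and `update_prototype` until the labels stop changing. How the paper combines such steps across several views is not described here and would need the full text.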
Pages: 24913 - 24924
Page count: 12
Related papers
50 in total
  • [1] A CLUSTERING ALGORITHM FOR MIXED NUMERIC AND CATEGORICAL DATA
    Ohn Mar San
    Van-Nam Huynh
    Yoshiteru Nakamori
    [J]. Journal of Systems Science & Complexity, 2003, (04) : 562 - 571
  • [2] Algorithm for fuzzy clustering of mixed data with numeric and categorical attributes
    Ahmad, A
    Dey, L
    [J]. DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, PROCEEDINGS, 2005, 3816 : 561 - 572
  • [3] Clustering algorithm for incomplete data sets with mixed numeric and categorical attributes
    Sen, Wu
    Hong, Chen
    Xiaodong, Feng
    [J]. International Journal of Database Theory and Application, 2013, 6 (05) : 95 - 104
  • [4] A k-mean clustering algorithm for mixed numeric and categorical data
    Ahmad, Amir
    Dey, Lipika
    [J]. DATA & KNOWLEDGE ENGINEERING, 2007, 63 (02) : 503 - 527
  • [5] An improved k-prototypes clustering algorithm for mixed numeric and categorical data
    Ji, Jinchao
    Bai, Tian
    Zhou, Chunguang
    Ma, Chao
    Wang, Zhe
    [J]. NEUROCOMPUTING, 2013, 120 : 590 - 596
  • [6] Optimization of the Numeric and Categorical Attribute Weights in KAMILA Mixed Data Clustering Algorithm
    Martarelli, Nadia Junqueira
    Nagano, Marcelo Seido
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2019, PT I, 2019, 11871 : 20 - 27
  • [7] A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data
    Ji, Jinchao
    Pang, Wei
    Zhou, Chunguang
    Han, Xiao
    Wang, Zhe
    [J]. KNOWLEDGE-BASED SYSTEMS, 2012, 30 : 129 - 135
  • [8] Clustering Mixed Numeric and Categorical Data With Cuckoo Search
    Ji, Jinchao
    Pang, Wei
    Li, Zairong
    He, Fei
    Feng, Guozhong
    Zhao, Xiaowei
    [J]. IEEE ACCESS, 2020, 8 : 30988 - 31003
  • [9] An Affinity Propagation Clustering Algorithm for Mixed Numeric and Categorical Datasets
    Zhang, Kang
    Gu, Xingsheng
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
  • [10] Fuzzy K-prototypes algorithm for clustering mixed numeric and categorical valued data
    Chen, Ning
    Chen, An
    Zhou, Long-Xiang
    [J]. Ruan Jian Xue Bao/Journal of Software, 2001, 12 (08) : 1107 - 1119