Building a Decision Cluster Classification Model for High Dimensional Data by a Variable Weighting k-Means Method

Cited by: 0
Authors
Li, Yan [1 ]
Hung, Edward [1 ]
Chung, Korris [1 ]
Huang, Joshua [2 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[2] Univ Hong Kong, East Bus Tech Inst, Hong Kong, Hong Kong, Peoples R China
Keywords
Clustering; classification; W-k-means; k-NN
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a new classification method (ADCC) for high dimensional data is proposed. In this method, a decision cluster classification (DCC) model consists of a set of disjoint decision clusters, each labeled with a dominant class that determines the class of new objects falling in the cluster. A cluster tree is first generated from a training data set by recursively calling a variable weighting k-means algorithm. Then, the DCC model is selected from the tree. The Anderson-Darling test is used to determine the stopping condition for growing the tree. A series of experiments on both synthetic and real data sets shows that the new classification method (ADCC) performs better in accuracy and scalability than the existing methods of k-NN, decision tree and SVM. It is particularly suitable for large, high dimensional data with many classes.
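The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ADCC implementation, and it rests on several assumptions: the weight update follows the commonly cited W-k-means form (w_j proportional to the inverse of the within-cluster dispersion of variable j when beta = 2, with beta > 1 required); a simple purity and size threshold stands in for the Anderson-Darling stopping test; the leaves of the tree are taken as the decision clusters; and a new object is assigned to the nearest leaf center by plain Euclidean distance. All function names and the toy data are hypothetical.

```python
from collections import Counter

def sq_dist(x, z, weights=None, beta=2.0):
    """Squared Euclidean distance; with weights, variable j is scaled by w_j**beta."""
    if weights is None:
        return sum((a - b) ** 2 for a, b in zip(x, z))
    return sum((w ** beta) * (a - b) ** 2 for w, a, b in zip(weights, x, z))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start at points[0], then add farthest points."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda x: min(sq_dist(x, c) for c in centers)))
    return centers

def w_k_means(points, k, beta=2.0, n_iter=20):
    """Variable weighting k-means sketch; returns (assignments, variable weights)."""
    m = len(points[0])
    weights = [1.0 / m] * m
    centers = [list(c) for c in farthest_point_init(points, k)]
    assign = [0] * len(points)
    for _ in range(n_iter):
        # Step 1: assign each point to the nearest center under the weighted distance.
        assign = [min(range(k), key=lambda c: sq_dist(x, centers[c], weights, beta))
                  for x in points]
        # Step 2: recompute the center of every non-empty cluster.
        for c in range(k):
            members = [x for x, a in zip(points, assign) if a == c]
            if members:
                centers[c] = mean(members)
        # Step 3: update weights from the per-variable within-cluster dispersion D_j.
        disp = [sum((x[j] - centers[a][j]) ** 2 for x, a in zip(points, assign))
                for j in range(m)]
        nonzero = [d for d in disp if d > 0]
        if not nonzero:
            break
        weights = [0.0 if d == 0 else
                   1.0 / sum((d / t) ** (1.0 / (beta - 1.0)) for t in nonzero)
                   for d in disp]
    return assign, weights

def build_dcc(points, labels, k=2, min_size=4, purity=0.9, beta=2.0):
    """Grow a cluster tree recursively; return its leaves as (center, dominant class)."""
    dominant, top = Counter(labels).most_common(1)[0]
    leaf = [(mean(points), dominant)]
    # Stand-in stopping rule (NOT the Anderson-Darling test): small or nearly pure node.
    if len(points) < min_size or top / len(points) >= purity:
        return leaf
    assign, _ = w_k_means(points, k, beta)
    groups = [[i for i, a in enumerate(assign) if a == c] for c in range(k)]
    groups = [g for g in groups if g]
    if len(groups) < 2:  # the split did not separate anything, so stop here
        return leaf
    leaves = []
    for g in groups:
        leaves += build_dcc([points[i] for i in g], [labels[i] for i in g],
                            k, min_size, purity, beta)
    return leaves

def classify(leaves, x):
    """Label x with the dominant class of the nearest decision cluster."""
    return min(leaves, key=lambda leaf: sq_dist(x, leaf[0]))[1]

# Toy data: variable 0 separates the classes, variable 1 is noise.
points = [[0.0, 0.5], [0.2, 0.1], [0.1, 0.9], [0.3, 0.3],
          [5.0, 0.4], [5.2, 0.8], [5.1, 0.2], [5.3, 0.6]]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
_, weights = w_k_means(points, k=2)
model = build_dcc(points, labels)
print(weights, classify(model, [0.4, 0.7]), classify(model, [4.8, 0.1]))
```

On the toy data the weight of the separating variable ends up larger than that of the noise variable, which is the effect variable weighting is meant to have in high dimensional data. A fuller version would replace the purity threshold with a normality test such as Anderson-Darling on each candidate cluster.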
Pages: 337 / +
Page count: 3
Related papers
50 in total
  • [1] USING A VARIABLE WEIGHTING k-MEANS METHOD TO BUILD A DECISION CLUSTER CLASSIFICATION MODEL
    Li, Yan
    Hung, Edward
    Chung, Korris
    Huang, Joshua
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2012, 26 (02)
  • [2] A new variable weighting and selection procedure for K-means cluster analysis
    Steinley, Douglas
    Brusco, Michael J.
    MULTIVARIATE BEHAVIORAL RESEARCH, 2008, 43 (01) : 77 - 108
  • [3] An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data
    Jing, Liping
    Ng, Michael K.
    Huang, Joshua Zhexue
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2007, 19 (08) : 1026 - 1041
  • [4] Automated variable weighting in k-means type clustering
    Huang, JZX
    Ng, MK
    Rong, HQ
    Li, ZC
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2005, 27 (05) : 657 - 668
  • [5] The k-means forest classifier for high dimensional data
    Chen, Zizhong
    Ding, Xin
    Xia, Shuyin
    Chen, Baiyun
    2018 9TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK), 2018, : 322 - 327
  • [6] An iterative algorithm for optimal variable weighting in K-means clustering
    Zhang, Shaonan
    Li, Shanshan
    Hu, Jiaqiao
    Xing, Haipeng
    Zhu, Wei
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2019, 48 (05) : 1346 - 1365
  • [7] A Data Classification Method Using Genetic Algorithm and K-Means Algorithm with Optimizing Initial Cluster Center
    Shi, Haobin
    Xu, Meng
    2018 IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION ENGINEERING TECHNOLOGY (CCET), 2018, : 224 - 228
  • [8] A Parallel K-means Algorithm for High Dimensional Text Data
    Shan, Xiaolei
    Shen, Yanming
    Wang, Yuxin
    2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-TAIWAN (ICCE-TW), 2018,
  • [9] Sparse kernel k-means for high-dimensional data
    Guan, Xin
    Terada, Yoshikazu
    PATTERN RECOGNITION, 2023, 144
  • [10] Solving k-means on High-Dimensional Big Data
    Kappmeier, Jan-Philipp W.
    Schmidt, Daniel R.
    Schmidt, Melanie
    EXPERIMENTAL ALGORITHMS, SEA 2015, 2015, 9125 : 259 - 270