The k-means forest classifier for high dimensional data

Cited by: 0
Authors
Chen, Zizhong [1]
Ding, Xin [1]
Xia, Shuyin [1]
Chen, Baiyun [1]
Affiliation
[1] Chongqing University of Posts and Telecommunications, School of Computer Science and Technology, Chongqing, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
high dimensional data; attribute noise; k-means forest
DOI
10.1109/ICBK.2018.00050
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To the best of our knowledge, the priority search k-means tree algorithm is the most effective k-nearest neighbor algorithm for high-dimensional data. However, it is sensitive to attribute noise, which is common in high-dimensional spaces. This paper therefore presents a new method, named k-means forest, that combines the priority search k-means tree algorithm with random forest. The main idea is to build multiple priority search k-means trees, each making decisions on a fixed number of randomly selected attributes, and to obtain the final result by voting. We also design a parallel version of the algorithm. Experimental results on artificial and public benchmark data sets demonstrate the effectiveness of the proposed method.
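The idea in the abstract (several k-means trees, each built on a random subset of attributes, combined by voting) can be illustrated with a minimal Python sketch. This is not the authors' implementation: each tree here is searched by a simple greedy descent to one leaf, not by the priority search described in the paper, and all names and parameters (`fit_forest`, `predict`, `n_attrs`, `leaf_size`, the farthest-point initialisation) are hypothetical choices made for illustration.

```python
import random
from collections import Counter

def sq_dist(a, b, attrs):
    """Squared Euclidean distance restricted to the attribute indices in attrs."""
    return sum((a[j] - b[j]) ** 2 for j in attrs)

def kmeans(points, idx, attrs, k, rng, iters=5):
    """Lloyd's k-means over rows idx; returns (centers, groups of row indices)."""
    # Farthest-point initialisation: an assumption of this sketch, not from the paper.
    centers = [points[rng.choice(idx)]]
    while len(centers) < k:
        far = max(idx, key=lambda i: min(sq_dist(points[i], c, attrs) for c in centers))
        centers.append(points[far])
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for i in idx:
            nearest = min(range(k), key=lambda c: sq_dist(points[i], centers[c], attrs))
            groups[nearest].append(i)
        # Each center becomes the mean of its group; an empty group keeps its old center.
        centers = [
            [sum(col) / len(g) for col in zip(*(points[i] for i in g))] if g else centers[c]
            for c, g in enumerate(groups)
        ]
    return centers, groups

def build_tree(points, idx, attrs, k, leaf_size, rng):
    """Recursively split rows idx with k-means until nodes are small enough."""
    if len(idx) <= max(leaf_size, k):
        return {"leaf": idx}
    centers, groups = kmeans(points, idx, attrs, k, rng)
    kept = [(c, g) for c, g in zip(centers, groups) if g]
    if len(kept) < 2:  # degenerate split: stop and keep the node as a leaf
        return {"leaf": idx}
    return {"children": [(c, build_tree(points, g, attrs, k, leaf_size, rng)) for c, g in kept]}

def tree_vote(tree, points, labels, attrs, q, n_neighbors):
    """Greedy descent to one leaf (not priority search), then a k-NN vote inside it."""
    node = tree
    while "children" in node:
        node = min(node["children"], key=lambda ch: sq_dist(q, ch[0], attrs))[1]
    nearest = sorted(node["leaf"], key=lambda i: sq_dist(q, points[i], attrs))[:n_neighbors]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def fit_forest(points, n_trees, n_attrs, k=2, leaf_size=8, seed=0):
    """Build n_trees k-means trees, each on its own random subset of n_attrs attributes."""
    rng = random.Random(seed)
    dim = len(points[0])
    forest = []
    for _ in range(n_trees):
        attrs = rng.sample(range(dim), n_attrs)
        rows = list(range(len(points)))
        forest.append((attrs, build_tree(points, rows, attrs, k, leaf_size, rng)))
    return forest

def predict(forest, points, labels, q, n_neighbors=3):
    """Final label is the majority vote over the per-tree predictions."""
    votes = Counter(tree_vote(t, points, labels, a, q, n_neighbors) for a, t in forest)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    data_rng = random.Random(1)
    X = [[data_rng.uniform(-1, 1) for _ in range(6)] for _ in range(20)]
    X += [[10 + data_rng.uniform(-1, 1) for _ in range(6)] for _ in range(20)]
    y = [0] * 20 + [1] * 20
    forest = fit_forest(X, n_trees=5, n_attrs=3)
    print(predict(forest, X, y, [0.0] * 6), predict(forest, X, y, [10.0] * 6))
```

Because each tree is built and queried independently of the others, the loop in `fit_forest` and the vote collection in `predict` are natural places to parallelise, for example with one worker per tree; how the paper's parallel version is actually organised is not stated in the abstract.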
Pages: 322-327
Number of pages: 6