The k-means forest classifier for high dimensional data

Cited by: 0
Authors
Chen, Zizhong [1]
Ding, Xin [1]
Xia, Shuyin [1]
Chen, Baiyun [1]
Affiliation
[1] Chongqing University of Posts and Telecommunications, School of Computer Science and Technology, Chongqing, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
high dimensional data; attribute noise; k-means forest
DOI
10.1109/ICBK.2018.00050
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
To the best of our knowledge, the priority search k-means tree algorithm is the most effective k-nearest neighbor algorithm for high dimensional data. However, it is sensitive to attribute noise, which is common in high dimensional spaces. This paper therefore presents a new method, named k-means forest, that combines the priority search k-means tree algorithm with random forest. The main idea is to build multiple priority search k-means trees, each making decisions on a fixed number of randomly selected attributes, and to obtain the final result by voting. We also design a parallel version of the algorithm. Experimental results on artificial and public benchmark data sets demonstrate the effectiveness of the proposed method.
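
The voting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a brute-force k-nearest neighbor search stands in for the priority search k-means tree, and the function names (`subspace_knn`, `vote`, `forest_predict`), the parameter defaults, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def subspace_knn(train_X, train_y, query, attrs, k):
    """Majority label of the k nearest training rows, measuring squared
    Euclidean distance over the attribute indices in `attrs` only.
    Brute force here (assumption); the paper uses a priority search k-means tree."""
    def dist(row):
        return sum((row[a] - query[a]) ** 2 for a in attrs)

    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def vote(train_X, train_y, query, subsets, k):
    """Run one subspace search per attribute subset and return the majority label."""
    ballots = Counter(subspace_knn(train_X, train_y, query, s, k) for s in subsets)
    return ballots.most_common(1)[0][0]


def forest_predict(train_X, train_y, query, n_trees=25, n_attrs=2, k=3, seed=0):
    """Draw `n_trees` random attribute subsets of fixed size `n_attrs` and vote."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    subsets = [rng.sample(range(n_features), n_attrs) for _ in range(n_trees)]
    return vote(train_X, train_y, query, subsets, k)


# Toy data (invented for illustration): 20 attributes, two well-separated classes.
train_X = [[0] * 20, [1] * 20, [0, 1] * 10, [10] * 20, [9] * 20, [10, 9] * 10]
train_y = [0, 0, 0, 1, 1, 1]
# A class-0 query whose last attribute is corrupted by noise.
query = [0] * 19 + [200]

full_space = subspace_knn(train_X, train_y, query, range(20), k=3)
forest = forest_predict(train_X, train_y, query)
print("full-space k-NN:", full_space)
print("subspace vote:  ", forest)
```

In this toy setup the corrupted attribute pulls the full-space search toward class 1, while any subset that omits that attribute votes for class 0. Because most random two-attribute subsets omit it, the vote is expected to recover class 0, which is the intuition behind the robustness to attribute noise that the abstract claims.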
Pages: 322-327
Number of pages: 6