The k-means forest classifier for high dimensional data

Citations: 0

Authors:

Chen, Zizhong [1]

Ding, Xin [1]

Xia, Shuyin [1]

Chen, Baiyun [1]

Affiliation:

[1] Chongqing Univ Posts & Telecommun, Sch Comp Sci & Technol, Chongqing, Peoples R China

Source:

2018 9TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK) | 2018

Funding:

National Natural Science Foundation of China;

Keywords:

high dimensional data; attribute noise; k-means forest;

DOI:

10.1109/ICBK.2018.00050

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

To the best of our knowledge, the priority search k-means tree algorithm is the most effective k-nearest neighbor algorithm for high dimensional data. However, it is sensitive to attribute noise, which is common in high dimensional spaces. This paper therefore presents a new method, named k-means forest, that combines the priority search k-means tree algorithm with random forest. The main idea is to build multiple priority search k-means trees, each making decisions on a fixed number of randomly selected attributes, and to obtain the final result by voting. We also design a parallel version of the algorithm. Experimental results on artificial and public benchmark data sets demonstrate the effectiveness of the proposed method.
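
As a rough illustration of the idea described in the abstract, the Python sketch below builds several classifiers, each restricted to a fixed number of randomly selected attributes, and combines their predictions by majority vote. It is not the authors' implementation: a brute-force k-nearest-neighbor search stands in for the priority search k-means tree, no parallel version is shown, and the function names `knn_predict` and `forest_predict` and the parameters `n_trees`, `n_attrs`, `k`, and `seed` are assumptions made for this example.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, query, attrs, k):
    """Brute-force k-NN vote using only the attribute indices in `attrs`.

    Assumption: this exhaustive search is a stand-in for one priority
    search k-means tree, which would find approximate neighbors faster.
    """
    def dist2(row):
        # Squared Euclidean distance over the selected attributes only.
        return sum((row[a] - query[a]) ** 2 for a in attrs)

    nearest = sorted(range(len(train_X)), key=lambda i: dist2(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def forest_predict(train_X, train_y, query, n_trees=15, n_attrs=2, k=3, seed=0):
    """Majority vote over `n_trees` members, each on a random attribute subset."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_features = len(train_X[0])
    votes = []
    for _ in range(n_trees):
        # Each member sees a fixed number of randomly chosen attributes.
        attrs = rng.sample(range(n_features), n_attrs)
        votes.append(knn_predict(train_X, train_y, query, attrs, k))
    return Counter(votes).most_common(1)[0][0]
```

Because each member looks at only a few attributes, a noisy attribute affects only the members whose random subset happens to include it, which is the intuition behind combining the trees by voting.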


Pages: 322 - 327

Page count: 6


