A Survey on Feature Weighting Based K-Means Algorithms

Cited by: 53

Author:

de Amorim, Renato Cordeiro [1]

Affiliation:

[1] Univ Hertfordshire, Hatfield, Herts, England

Source:

JOURNAL OF CLASSIFICATION | 2016 / Vol. 33 / Issue 02

Keywords:

Feature weighting; K-Means; Partitional clustering; Feature selection; FEATURE-SELECTION; DATA SETS; MAXIMUM-LIKELIHOOD; CLUSTER-ANALYSIS; CLASSIFICATION; VARIABLES;

DOI:

10.1007/s00357-016-9208-4

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

In a real-world data set, there is always the possibility, rather high in our opinion, that different features may have different degrees of relevance. Most machine learning algorithms deal with this fact by either selecting or deselecting features in the data preprocessing phase. However, we maintain that even among relevant features there may be different degrees of relevance, and this should be taken into account during the clustering process. With over 50 years of history, K-Means is arguably the most popular partitional clustering algorithm there is. The first K-Means based clustering algorithm to compute feature weights was designed just over 30 years ago. Various such algorithms have been designed since but there has not been, to our knowledge, a survey integrating empirical evidence of cluster recovery ability, common flaws, and possible directions for future research. This paper elaborates on the concept of feature weighting and addresses these issues by critically analyzing some of the most popular, or innovative, feature weighting mechanisms based in K-Means.

Pages: 210-242

Number of pages: 33

