Improving k-means Clustering with Genetic Programming for Feature Construction

被引:3
|
作者
Lensen, Andrew [1 ]
Xue, Bing [1 ]
Zhang, Mengjie [1 ]
机构
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
关键词
Cluster Analysis; Feature Construction; Genetic Programming; k-means; Evolutionary Computation;
D O I
10.1145/3067695.3075962
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
k-means is one of the most commonly used clustering algorithms in data mining. Despite this, it has a number of fundamental limitations which prevent it from performing effectively on large or otherwise difficult datasets. A common technique to improve performance of data mining algorithms is feature construction, a technique which combines features together to produce more powerful constructed features that can improve the performance of a given algorithm. Genetic Programming (GP) has been used for feature construction very successfully, due to its program-like structure. This paper proposes two representations for using GP to perform feature construction to improve the performance of k-means, using a wrapper approach. Our results show significant improvements in performance compared to k-means using all original features across six difficult datasets.
引用
收藏
页码:237 / 238
页数:2
相关论文
共 50 条
  • [1] Feature weighting in k-means clustering
    Modha, DS
    Spangler, WS
    MACHINE LEARNING, 2003, 52 (03) : 217 - 237
  • [2] Feature Weighting in k-Means Clustering
    Dharmendra S. Modha
    W. Scott Spangler
    Machine Learning, 2003, 52 : 217 - 237
  • [3] Deterministic Feature Selection for k-Means Clustering
    Boutsidis, Christos
    Magdon-Ismail, Malik
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2013, 59 (09) : 6099 - 6110
  • [4] Statistically Improving K-means Clustering Performance
    Ihsanoglu, Abdullah
    Zaval, Mounes
    32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
  • [5] Clustering with Niching Genetic K-means algorithm
    Sheng, WG
    Tucker, A
    Liu, XH
    GENETIC AND EVOLUTIONARY COMPUTATION GECCO 2004 , PT 2, PROCEEDINGS, 2004, 3103 : 162 - 173
  • [6] Modified K-Means Algorithm for Genetic Clustering
    Bonab, Mohammad Babrdel
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2011, 11 (09): : 24 - 28
  • [7] Empirical Evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means Clustering Algorithms
    Banerjee, Shreya
    Choudhary, Ankit
    Pal, Somnath
    2015 IEEE INTERNATIONAL WIE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (WIECON-ECE), 2015, : 172 - 176
  • [8] K-means Clustering with Feature Selection for Stream Data
    Wang, Xiao-dong
    Chen, Rung-Ching
    Yan, Fei
    Hendry
    2018 INTERNATIONAL SYMPOSIUM ON COMPUTER, CONSUMER AND CONTROL (IS3C 2018), 2018, : 453 - 456
  • [9] Feature Selection Algorithm Based on K-means Clustering
    Tang, Xue
    Dong, Min
    Bi, Sheng
    Pei, Maofeng
    Cao, Dan
    Xie, Cheche
    Chi, Sunhuang
    2017 IEEE 7TH ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (CYBER), 2017, : 1522 - 1527
  • [10] KERNEL OVERLAPPING K-MEANS FOR CLUSTERING IN FEATURE SPACE
    Ben N'Cir, Chiheb-Eddine
    Essoussi, Nadia
    Bertrand, Patrice
    KDIR 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL, 2010, : 250 - 256