Improving k-means Clustering with Genetic Programming for Feature Construction

Cited by: 3
Authors
Lensen, Andrew [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Source
PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'17 COMPANION) | 2017
Keywords
Cluster Analysis; Feature Construction; Genetic Programming; k-means; Evolutionary Computation
DOI
10.1145/3067695.3075962
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
k-means is one of the most commonly used clustering algorithms in data mining. Despite this, it has a number of fundamental limitations which prevent it from performing effectively on large or otherwise difficult datasets. A common technique to improve performance of data mining algorithms is feature construction, a technique which combines features together to produce more powerful constructed features that can improve the performance of a given algorithm. Genetic Programming (GP) has been used for feature construction very successfully, due to its program-like structure. This paper proposes two representations for using GP to perform feature construction to improve the performance of k-means, using a wrapper approach. Our results show significant improvements in performance compared to k-means using all original features across six difficult datasets.
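The wrapper idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: the function set (`add`, `sub`, `mul`), the tree depth, the fitness measure (within-cluster scatter divided by total scatter in the constructed space), and every function name are assumptions made for this sketch. A plain random search stands in for GP's crossover and mutation, and the two representations proposed in the paper are not reproduced because the abstract does not describe them.

```python
import random

# Assumed function set for the expression trees (not taken from the paper).
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}

def random_tree(n_features, depth, rng):
    """Grow a random expression tree; leaves are indices of original features."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))

def evaluate(tree, row):
    """Evaluate one tree on one instance, giving one constructed feature value."""
    if isinstance(tree, int):
        return row[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))

def construct(trees, data):
    """Map every instance to its vector of constructed features."""
    return [[evaluate(t, row) for t in trees] for row in data]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iters=20):
    """Basic Lloyd-style k-means; returns a cluster label per point."""
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fitness(trees, data, k, rng):
    """Wrapper fitness (assumed): within-cluster / total scatter, lower is better."""
    feats = construct(trees, data)
    labels = kmeans(feats, k, rng)
    mean = [sum(col) / len(feats) for col in zip(*feats)]
    total = sum(sq_dist(f, mean) for f in feats)
    if total == 0:  # constant constructed features carry no information
        return float("inf")
    within = 0.0
    for c in range(k):
        members = [f for f, l in zip(feats, labels) if l == c]
        if members:
            centroid = [sum(col) / len(members) for col in zip(*members)]
            within += sum(sq_dist(m, centroid) for m in members)
    return within / total

def search(data, k, n_constructed=2, candidates=30, seed=0):
    """Random search over tree sets; a stand-in for a real GP evolutionary loop."""
    rng = random.Random(seed)
    n_features = len(data[0])
    best, best_fit = None, float("inf")
    for _ in range(candidates):
        trees = [random_tree(n_features, 3, rng) for _ in range(n_constructed)]
        fit = fitness(trees, data, k, rng)
        if fit < best_fit:
            best, best_fit = trees, fit
    return best, best_fit
```

Calling `search(data, k)` on a list of numeric rows returns the best set of trees found and its fitness; `construct(best, data)` then yields the constructed features that k-means would cluster. A real GP system would replace the random search with selection, crossover, and mutation over a population, and would likely use a more careful fitness measure.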
Pages: 237 - 238
Page count: 2
Related papers
50 in total
  • [31] On the performance of feature weighting K-means for text subspace clustering
    Jing, LP
    Ng, MK
    Xu, J
    Huang, JZX
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2005, 3739 : 502 - 512
  • [32] The Global Kernel k-Means Algorithm for Clustering in Feature Space
    Tzortzis, Grigorios F.
    Likas, Aristidis C.
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2009, 20 (07) : 1181 - 1194
  • [33] Cloning Localization Based on Feature Extraction and K-means Clustering
    Alfraih, Areej S.
    Briffa, Johann A.
    Wesemeyer, Stephan
    DIGITAL-FORENSICS AND WATERMARKING, IWDW 2014, 2015, 9023 : 410 - 419
  • [34] Unsupervised Bayesian feature selection based on k-means clustering
    Yan, Liu
    Yan, Peng
    IC-BNMT 2007: PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON BROADBAND NETWORK & MULTIMEDIA TECHNOLOGY, 2007 : 352 - 356
  • [35] Network Pruning by Feature Map Sharing with K-Means Clustering
    Chiu, De-Yang
    Huang, Shih-Hsu
    2022 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - TAIWAN, IEEE ICCE-TW 2022, 2022 : 143 - 144
  • [36] Flexible Subspace Clustering: A Joint Feature Selection and K-Means Clustering Framework
    Long, Zhong-Zhen
    Xu, Guoxia
    Du, Jiao
    Zhu, Hu
    Yan, Taiyu
    Yu, Yu-Feng
    BIG DATA RESEARCH, 2021, 23
  • [37] K-Means Cloning: Adaptive Spherical K-Means Clustering
    Hedar, Abdel-Rahman
    Ibrahim, Abdel-Monem M.
    Abdel-Hakim, Alaa E.
    Sewisy, Adel A.
    ALGORITHMS, 2018, 11 (10)
  • [38] Improving the Scalability of a Prosumer Cooperative Game with K-Means Clustering
    Han, Liyang
    Morstyn, Thomas
    Crozier, Constance
    McCulloch, Malcolm
    2019 IEEE MILAN POWERTECH, 2019
  • [39] An Optimized K-means Clustering for Improving Accuracy in Traffic Classification
    Zhao, Shasha
    Xiao, Yi
    Ning, Yueqiang
    Zhou, Yuxiao
    Zhang, Dengying
    WIRELESS PERSONAL COMMUNICATIONS, 2021, 120 (01) : 81 - 93
  • [40] K-Means Initialization Methods for Improving Clustering by Simulated Annealing
    Perim, Gabriela Trazzi
    Wandekokem, Estefhan Dazzi
    Varejao, Flavio Miguel
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2008, PROCEEDINGS, 2008, 5290 : 133 - 142