Improving k-means Clustering with Genetic Programming for Feature Construction

Cited by: 3
Authors
Lensen, Andrew [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Source
PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'17 COMPANION) | 2017
Keywords
Cluster Analysis; Feature Construction; Genetic Programming; k-means; Evolutionary Computation
DOI
10.1145/3067695.3075962
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
k-means is one of the most commonly used clustering algorithms in data mining. Despite this, it has a number of fundamental limitations which prevent it from performing effectively on large or otherwise difficult datasets. A common technique to improve performance of data mining algorithms is feature construction, a technique which combines features together to produce more powerful constructed features that can improve the performance of a given algorithm. Genetic Programming (GP) has been used for feature construction very successfully, due to its program-like structure. This paper proposes two representations for using GP to perform feature construction to improve the performance of k-means, using a wrapper approach. Our results show significant improvements in performance compared to k-means using all original features across six difficult datasets.
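The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the two GP representations and the fitness function are not given in this record, so the sketch substitutes a hand-written list of candidate constructed features for evolved GP trees, and scores each candidate by running k-means on the constructed features and measuring the fraction of total scatter explained by the clustering. The names `wrapper_fitness`, `scatter_ratio`, `ring`, and the toy two-ring dataset are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int, max_iters: int = 100):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centroids: List[Point] = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels: List[int] = []
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def scatter_ratio(points, labels, centroids) -> float:
    """1 - within-cluster scatter / total scatter (assumed fitness, in [0, 1])."""
    mean = tuple(sum(col) / len(points) for col in zip(*points))
    total = sum(math.dist(p, mean) ** 2 for p in points)
    within = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return 1.0 - within / total if total > 0 else 0.0


def wrapper_fitness(data, construct: Callable[[Point], Point], k: int) -> float:
    """Wrapper evaluation: build features, cluster them, score the clustering."""
    constructed = [construct(x) for x in data]
    labels, centroids = kmeans(constructed, k)
    return scatter_ratio(constructed, labels, centroids)


def ring(radius: float, n: int = 12) -> List[Point]:
    """n evenly spaced points on a circle (toy data, hypothetical)."""
    return [
        (radius * math.cos(2 * math.pi * i / n), radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


# Two concentric rings: hard for k-means on the original (x, y) features.
data = ring(1.0) + ring(3.0)

# Hand-written stand-ins for GP individuals (a real GP would evolve these).
candidates: Dict[str, Callable[[Point], Point]] = {
    "original": lambda x: x,
    "squared_radius": lambda x: (x[0] ** 2 + x[1] ** 2,),
}

scores = {name: wrapper_fitness(data, f, k=2) for name, f in candidates.items()}
best = max(scores, key=scores.get)
```

On this toy data the constructed feature `squared_radius` separates the two rings almost perfectly, whereas k-means on the original features can only cut the rings with a straight boundary. A GP-based approach would search over such constructions automatically instead of choosing from a fixed list.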
Pages: 237-238
Number of pages: 2