Improving k-means Clustering with Genetic Programming for Feature Construction

Cited by: 3
Authors
Lensen, Andrew [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Source
PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'17 COMPANION) | 2017
Keywords
Cluster Analysis; Feature Construction; Genetic Programming; k-means; Evolutionary Computation
DOI
10.1145/3067695.3075962
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
k-means is one of the most commonly used clustering algorithms in data mining. Despite this, it has a number of fundamental limitations that prevent it from performing effectively on large or otherwise difficult datasets. A common way to improve the performance of data mining algorithms is feature construction, which combines original features into more powerful constructed features. Genetic Programming (GP) has been used very successfully for feature construction because of its program-like structure. This paper proposes two representations for using GP to perform feature construction, in a wrapper approach, to improve the performance of k-means. Our results show significant improvements in performance over k-means using all original features across six difficult datasets.
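The wrapper idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the operator set, the multi-tree candidate (a list of expression trees, one per constructed feature), the fitness measure (within-cluster over total sum of squares in the constructed space, lower is better), and the use of plain random search in place of a full GP with crossover and mutation. The names `random_tree`, `evaluate`, `kmeans_wcss`, `fitness`, and `search` are hypothetical.

```python
import random

# Operator set for the expression trees. This is an assumption for the
# sketch; the abstract does not list the operators the authors used.
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}

def random_tree(n_features, depth, rng):
    """Grow a random tree: leaves are feature indices, inner nodes are operators."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(n_features, depth - 1, rng), random_tree(n_features, depth - 1, rng))

def evaluate(tree, row):
    """Evaluate one tree on one instance, giving one constructed feature value."""
    if isinstance(tree, int):
        return row[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wcss(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns the within-cluster sum of squares."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

def fitness(trees, data, k, seed):
    """Wrapper fitness (assumed): run k-means on the constructed features and
    score the result by WCSS / total sum of squares; lower is better."""
    constructed = [[evaluate(t, row) for t in trees] for row in data]
    mean = [sum(col) / len(constructed) for col in zip(*constructed)]
    total = sum(sq_dist(p, mean) for p in constructed)
    if total == 0:  # all constructed features are constant: useless candidate
        return float("inf")
    return kmeans_wcss(constructed, k, random.Random(seed)) / total

def search(data, k, n_constructed=2, candidates=30, seed=0):
    """Random search over candidate tree lists, standing in for a full GP."""
    rng = random.Random(seed)
    best, best_fit = None, float("inf")
    for _ in range(candidates):
        trees = [random_tree(len(data[0]), 2, rng) for _ in range(n_constructed)]
        fit = fitness(trees, data, k, seed)
        if fit < best_fit:
            best, best_fit = trees, fit
    return best, best_fit

# Toy data: two groups separated on features 0 and 1; feature 2 is noise.
data = [[0.1, 0.2, 5.0], [0.2, 0.1, 1.0], [0.0, 0.3, 9.0], [0.3, 0.0, 3.0],
        [5.1, 5.2, 2.0], [5.0, 5.3, 8.0], [5.2, 5.1, 4.0], [5.3, 5.0, 6.0]]
best_trees, best_fit = search(data, k=2)
print(best_trees, best_fit)
```

The point of the sketch is the wrapper loop: each candidate is scored by actually running k-means on the features it constructs, so the search is steered by clustering quality rather than by a filter measure on the features alone. A ratio-based score like this one can still be gamed by degenerate constructed features, so it should be read as a placeholder for whatever fitness the paper defines.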
Pages: 237-238
Page count: 2