Genetic programming for multiple-feature construction on high-dimensional classification

Cited by: 56
Authors
Binh Tran [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliation
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Keywords
Feature construction; Genetic programming; Classification; Class dependence; High-dimensional data; SELECTION
DOI
10.1016/j.patcog.2019.05.006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Data representation is an important factor in determining the performance of machine learning algorithms, including classification. Feature construction (FC) can combine original features to form high-level ones that can help classification algorithms achieve better performance. Genetic programming (GP) has shown promise in FC due to its flexible representation. Most GP methods construct a single feature, which may not scale well to high-dimensional data. This paper investigates different approaches to constructing multiple features and analyses their effectiveness, efficiency, and underlying behaviours to gain insight into multiple-feature construction using GP on high-dimensional data. The results show that multiple-feature construction achieves significantly better performance than single-feature construction. In multiple-feature construction, a multi-tree GP representation is shown to be more effective than a single-tree one, thanks to its ability to consider the interaction of the newly constructed features during the construction process. Class-dependent constructed features achieve better performance than class-independent ones. A visualisation of the constructed features also demonstrates the interpretability of the GP-based FC approach, which is important to many real-world applications. (C) 2019 Elsevier Ltd. All rights reserved.
Pages: 404-417
Page count: 14
Related papers
  • [1] Genetic programming for feature construction and selection in classification on high-dimensional data
    Binh Tran
    Bing Xue
    Mengjie Zhang
    [J]. Memetic Computing, 2016, 8 : 3 - 15
  • [2] Genetic programming for feature construction and selection in classification on high-dimensional data
    Binh Tran
    Xue, Bing
    Zhang, Mengjie
    [J]. MEMETIC COMPUTING, 2016, 8 (01) : 3 - 15
  • [3] Genetic Programming with Embedded Feature Construction for High-Dimensional Symbolic Regression
    Chen, Qi
    Zhang, Mengjie
    Xue, Bing
    [J]. INTELLIGENT AND EVOLUTIONARY SYSTEMS, IES 2016, 2017, 8 : 87 - 102
  • [4] Multiple Feature Construction in Classification on High-Dimensional Data Using GP
    Binh Tran
    Zhang, Mengjie
    Xue, Bing
    [J]. PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,
  • [5] Genetic Programming for Multiple Feature Construction in Skin Cancer Image Classification
    Ul Ain, Qurrat
    Xue, Bing
    Al-Sahaf, Harith
    Zhang, Mengjie
    [J]. 2019 INTERNATIONAL CONFERENCE ON IMAGE AND VISION COMPUTING NEW ZEALAND (IVCNZ), 2019,
  • [6] A hybrid multiple feature construction approach for classification using Genetic Programming
    Ma, Jianbin
    Teng, Guifa
    [J]. APPLIED SOFT COMPUTING, 2019, 80 : 687 - 699
  • [7] A Novel Multiobjective Genetic Programming Approach to High-Dimensional Data Classification
    Zhou, Yu
    Yang, Nanjian
    Huang, Xingyue
    Lee, Jaesung
    Kwong, Sam
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (09) : 5205 - 5216
  • [8] Genetic Programming Based on Granular Computing for Classification with High-Dimensional Data
    Pei, Wenbin
    Xue, Bing
    Shang, Lin
    Zhang, Mengjie
    [J]. AI 2018: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11320 : 643 - 655
  • [9] Genetic Programming for Borderline Instance Detection in High-dimensional Unbalanced Classification
    Pei, Wenbin
    Xue, Bing
    Shang, Lin
    Zhang, Mengjie
    [J]. PROCEEDINGS OF THE 2021 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'21), 2021, : 349 - 357
  • [10] Multiple Feature Construction for Effective Biomarker Identification and Classification using Genetic Programming
    Ahmed, Soha
    Zhang, Mengjie
    Peng, Lifeng
    Xue, Bing
    [J]. GECCO'14: PROCEEDINGS OF THE 2014 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2014, : 249 - 256