Genetic programming for multiple-feature construction on high-dimensional classification

Cited by: 56
Authors
Tran, Binh [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Keywords
Feature construction; Genetic programming; Classification; Class dependence; High-dimensional data; SELECTION
DOI
10.1016/j.patcog.2019.05.006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Data representation is an important factor in determining the performance of machine learning algorithms, including classification algorithms. Feature construction (FC) can combine original features to form high-level ones that can help classification algorithms achieve better performance. Genetic programming (GP) has shown promise in FC due to its flexible representation. Most GP methods construct a single feature, which may not scale well to high-dimensional data. This paper investigates different approaches to constructing multiple features and analyses their effectiveness, efficiency, and underlying behaviour to provide insight into multiple-feature construction using GP on high-dimensional data. The results show that multiple-feature construction achieves significantly better performance than single-feature construction. In multiple-feature construction, the multi-tree GP representation is shown to be more effective than the single-tree GP representation thanks to its ability to consider the interaction of the newly constructed features during the construction process. Class-dependent constructed features achieve better performance than class-independent ones. A visualisation of the constructed features also demonstrates the interpretability of the GP-based FC approach, which is important to many real-world applications. © 2019 Elsevier Ltd. All rights reserved.
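The multi-tree representation mentioned in the abstract can be illustrated with a minimal sketch. It only shows how one individual holding several expression trees maps each sample to several constructed features. The evolutionary loop (selection, crossover, mutation) and the fitness function are omitted. The function set, the tuple encoding of trees, and the names `FUNCTIONS`, `evaluate`, and `construct_features` are assumptions made for illustration, not details taken from the paper.

```python
import operator

# Hypothetical function set; the paper's actual function set is not given here.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, sample):
    """Evaluate one expression tree on one sample (a list of feature values)."""
    if isinstance(tree, int):
        # Leaf node: the index of an original feature.
        return sample[tree]
    # Internal node, encoded as (operator symbol, left subtree, right subtree).
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, sample), evaluate(right, sample))

def construct_features(individual, data):
    """Return one constructed feature per tree for every sample in data."""
    return [[evaluate(tree, sample) for tree in individual] for sample in data]

# A multi-tree individual: each tree yields one constructed feature.
individual = [
    ("+", 0, ("*", 1, 2)),  # f0 + f1 * f2
    ("-", 3, 0),            # f3 - f0
]

data = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.0, 2.0, 0.0],
]

constructed = construct_features(individual, data)
print(constructed)
```

Because all trees belong to the same individual, a fitness function could score the constructed features jointly, which is one way to account for their interaction during construction.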
Pages: 404-417
Number of pages: 14
Related papers
50 in total
  • [41] Multivariate Feature Ranking With High-Dimensional Data for Classification Tasks
    Jimenez, Fernando
    Sanchez, Gracia
    Palma, Jose
    Miralles-Pechuan, Luis
    Botia, Juan A.
    IEEE ACCESS, 2022, 10 : 60421 - 60437
  • [42] Simultaneous relevant feature identification and classification in high-dimensional spaces
    Grate, LR
    Bhattacharyya, C
    Jordan, MI
    Mian, IS
    ALGORITHMS IN BIOINFORMATICS, PROCEEDINGS, 2002, 2452 : 1 - 9
  • [43] Feature Extraction and Classification Models for High-dimensional Profile Data
    Shinde, Amit
    Church, George
    Janakiram, Mani
    Runger, George
    QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2011, 27 (07) : 885 - 893
  • [44] High-dimensional spectral data classification with nonparametric feature screening
    Li, Chuan-Quan
    Xu, Qing-Song
    JOURNAL OF CHEMOMETRICS, 2020, 34 (03)
  • [45] Improving Land Cover Classification Using Genetic Programming for Feature Construction
    Batista, Joao E.
    Cabral, Ana I. R.
    Vasconcelos, Maria J. P.
    Vanneschi, Leonardo
    Silva, Sara
    REMOTE SENSING, 2021, 13 (09)
  • [46] Classification in High-Dimensional Feature Spaces: Random Subsample Ensemble
    Serpen, Gursel
    Pathical, Santhosh
    EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 740 - 745
  • [47] High-Dimensional Unbalanced Binary Classification by Genetic Programming with Multi-Criterion Fitness Evaluation and Selection
    Pei, Wenbin
    Xue, Bing
    Shang, Lin
    Zhang, Mengjie
    EVOLUTIONARY COMPUTATION, 2022, 30 (01) : 99 - 129
  • [48] Semisupervised Classification With Novel Graph Construction for High-Dimensional Data
    Yu, Zhiwen
    Ye, Fengxu
    Yang, Kaixiang
    Cao, Wenming
    Chen, C. L. Philip
    Cheng, Lianglun
    You, Jane
    Wong, Hau-San
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (01) : 75 - 88
  • [49] Enhancing classification with hybrid feature selection: A multi-objective genetic algorithm for high-dimensional data
    Bohrer, Jonas da S.
    Dorn, Marcio
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [50] LAGAM: A Length-Adaptive Genetic Algorithm With Markov Blanket for High-Dimensional Feature Selection in Classification
    Zhou, Junhai
    Wu, Quanwang
    Zhou, MengChu
    Wen, Junhao
    Al-Turki, Yusuf
    Abusorrah, Abdullah
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (11) : 6858 - 6869