Feature Selection to Improve Generalization of Genetic Programming for High-Dimensional Symbolic Regression

Cited by: 89
Authors
Chen, Qi [1]
Zhang, Mengjie [1]
Xue, Bing [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, Evolutionary Computat Res Grp, Wellington 6140, New Zealand
Keywords
Feature selection; generalization; genetic programming (GP); symbolic regression (SR); classification; optimization
DOI
10.1109/TEVC.2017.2683489
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When learning from high-dimensional data for symbolic regression (SR), genetic programming (GP) typically does not generalize well. Feature selection, as a data preprocessing method, can potentially improve both the efficiency of learning algorithms and their generalization ability. However, in GP for high-dimensional SR, feature selection before learning is seldom considered. In this paper, we propose a new permutation-based feature selection method for high-dimensional SR using GP. A set of experiments has been conducted to investigate the effect of the proposed method on the generalization of GP for high-dimensional SR. The regression results confirm the superior performance of the proposed method over the other examined feature selection methods. Further analysis indicates that the models evolved by the proposed method are more likely to contain only the truly relevant features and have better interpretability.
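The abstract names permutation as the basis of the feature selection method but gives no procedure. As a rough illustration only, the sketch below shows generic permutation importance: shuffle one feature column at a time and measure how much the prediction error grows. It is not the authors' algorithm. The function names `permutation_importance` and `select_features`, the threshold value, the toy data, and the least-squares stand-in for a GP-evolved model are all assumptions made for this example.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to y but keeps its distribution.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(np.mean((predict(X_perm) - y) ** 2) - baseline)
        scores[j] = np.mean(increases)
    return scores

def select_features(scores, threshold):
    """Keep features whose permutation raises the error by more than threshold (assumed rule)."""
    return [j for j, s in enumerate(scores) if s > threshold]

# Toy data (assumption): only features 0 and 2 influence the target.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.01 * data_rng.normal(size=200)

# Stand-in model (assumption): ordinary least squares instead of a GP-evolved expression.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(A):
    return A @ coef

scores = permutation_importance(predict, X, y)
selected = select_features(scores, threshold=0.5)
print(selected)
```

In a GP setting, `predict` would be replaced by the evaluation of an evolved model, and the selected column indices would define the reduced terminal set for a subsequent GP run. The paper should be consulted for how the proposed method actually scores and selects features.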
Pages: 792-806
Number of pages: 15