Bayesian Model Selection for Reducing Bloat and Overfitting in Genetic Programming for Symbolic Regression

Cited by: 4
Authors
Bomarito, G. F. [1]
Leser, P. E. [1]
Strauss, N. C. M. [2]
Garbrecht, K. M. [2]
Hochhalter, J. D. [2]
Affiliations
[1] NASA Langley Res Ctr, Hampton, VA 23666 USA
[2] Univ Utah, Salt Lake City, UT USA
DOI
10.1145/3520304.3528899
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When performing symbolic regression using genetic programming, overfitting and bloat can negatively impact generalizability and interpretability of the resulting equations as well as increase computation times. A Bayesian fitness metric is introduced and its impact on bloat and overfitting during population evolution is studied and compared to common alternatives in the literature. The proposed approach was found to be more robust to noise and data sparsity in numerical experiments, guiding evolution to a level of complexity appropriate to the dataset. Further evolution of the population resulted not in overfitting or bloat, but rather in slight simplifications in model form. The ability to identify an equation of complexity appropriate to the scale of noise in the training data was also demonstrated. In general, the Bayesian model selection algorithm was shown to be an effective means of regularization which resulted in less bloat and overfitting when any amount of noise was present in the training data.
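The abstract does not say how the Bayesian fitness metric is computed, so the following is only a loose illustration of the general idea of scoring candidate models by approximate model evidence rather than by raw training error. It swaps in the Bayesian information criterion (BIC) as a rough approximation to the log marginal likelihood under Gaussian noise, and uses polynomials of increasing degree as stand-ins for evolved expressions. The function names `bic_log_evidence` and `select_degree` are hypothetical and are not taken from the paper.

```python
import numpy as np


def bic_log_evidence(residuals, n_params):
    """BIC-style approximation to the log model evidence (higher is better).

    Assumes i.i.d. Gaussian noise whose variance is set to its
    maximum-likelihood estimate. This is a stand-in, not the paper's metric.
    """
    r = np.asarray(residuals, dtype=float)
    n = r.size
    # Maximum-likelihood noise variance, floored to avoid log(0) on exact fits.
    sigma2 = max(float(np.mean(r ** 2)), np.finfo(float).tiny)
    # Gaussian log-likelihood evaluated at the maximum-likelihood variance.
    max_log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    # Each free parameter costs 0.5 * log(n), which penalizes extra complexity.
    return max_log_lik - 0.5 * n_params * np.log(n)


def select_degree(x, y, degrees):
    """Score polynomial candidates and return (best_degree, scores)."""
    scores = {}
    for d in degrees:
        coeffs = np.polyfit(x, y, d)
        residuals = y - np.polyval(coeffs, x)
        # d + 1 polynomial coefficients plus one noise-variance parameter.
        scores[d] = bic_log_evidence(residuals, n_params=d + 2)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 30)
    y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)  # noisy linear data
    best, scores = select_degree(x, y, degrees=range(0, 7))
    print("selected degree:", best)
```

Under this kind of score, a higher-degree polynomial is preferred only if it lowers the residual enough to offset its parameter penalty, which is the regularizing behavior the abstract attributes to Bayesian model selection.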
Pages: 526-529
Page count: 4