Symbolic regression as a feature engineering method for machine and deep learning regression tasks

Cited by: 1
Authors
Shmuel, Assaf [1]
Glickman, Oren [1]
Lazebnik, Teddy [2, 3]
Affiliations
[1] Bar Ilan Univ, Dept Comp Sci, Ramat Gan, Israel
[2] Ariel Univ, Dept Math, Ariel, Israel
[3] UCL, Canc Inst, Dept Canc Biol, London, England
Keywords
symbolic regression; neural network; data-driven physics; feature engineering; data science; FEATURE-SELECTION; BIG DATA; MODEL
DOI
10.1088/2632-2153/ad513a
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In machine learning (ML) and deep learning (DL) regression tasks, effective feature engineering (FE) is pivotal to model performance. Traditional FE approaches often rely on domain expertise to manually design features for ML models. In DL models, FE is embedded in the neural network's architecture, which makes it hard to interpret. In this study, we propose integrating symbolic regression (SR) as an FE step before an ML model to improve its performance. We show, through extensive experimentation on synthetic and 21 real-world datasets, that incorporating SR-derived features significantly enhances the predictive capabilities of both ML and DL regression models, with a 34%-86% root mean square error (RMSE) improvement on synthetic datasets and a 4%-11.5% improvement on real-world datasets. In an additional realistic use case, we show that the proposed method improves ML performance in predicting superconducting critical temperatures based on Eliashberg theory by more than 20% in terms of RMSE. These results outline the potential of SR as an FE component in data-driven models, improving them in terms of both performance and interpretability.
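The pipeline the abstract describes (derive a symbolic feature first, then hand the augmented table to a regressor) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the synthetic data, the helper names (`toy_symbolic_feature`, `fit_linear`, `predict_linear`), and above all the "SR" step, which is replaced here by a brute-force search over a tiny hypothetical library of pairwise expressions. A real SR engine would search a far larger expression space.

```python
import itertools
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the weight vector."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    return np.column_stack([X, np.ones(len(X))]) @ w

# Hypothetical stand-in for an SR engine: a tiny fixed library of
# binary expressions over pairs of input columns.
CANDIDATES = {
    "x{i} * x{j}": lambda a, b: a * b,
    "x{i} / x{j}": lambda a, b: a / b,
}

def toy_symbolic_feature(X, y):
    """Pick the candidate expression whose values correlate best with y."""
    best_name, best_fn, best_score = None, None, -1.0
    for i, j in itertools.permutations(range(X.shape[1]), 2):
        for template, op in CANDIDATES.items():
            score = abs(np.corrcoef(op(X[:, i], X[:, j]), y)[0, 1])
            if score > best_score:
                best_name = template.format(i=i, j=j)
                # Bind op, i, j as defaults so each closure keeps its own values.
                best_fn = lambda Z, op=op, i=i, j=j: op(Z[:, i], Z[:, j])
                best_score = score
    return best_name, best_fn

# Assumed synthetic task: the target depends on a ratio of two inputs.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 3.0, size=(400, 3))  # strictly positive, so ratios are safe
y = 2.0 * X[:, 0] / X[:, 2] + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, size=400)
X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]

# Baseline: the regressor sees only the raw features.
w_base = fit_linear(X_train, y_train)
rmse_base = rmse(y_test, predict_linear(w_base, X_test))

# SR-as-FE: find a symbolic feature on the training split, append it, refit.
name, feature = toy_symbolic_feature(X_train, y_train)
X_train_aug = np.column_stack([X_train, feature(X_train)])
X_test_aug = np.column_stack([X_test, feature(X_test)])
w_aug = fit_linear(X_train_aug, y_train)
rmse_aug = rmse(y_test, predict_linear(w_aug, X_test_aug))

print(f"selected feature: {name}")
print(f"RMSE without SR feature: {rmse_base:.4f}")
print(f"RMSE with SR feature:    {rmse_aug:.4f}")
```

The feature is selected on the training split only and then applied unchanged to the test split, so the comparison between the two RMSE values is not contaminated by test data. The linear model is used only to keep the sketch dependency-free; any regressor could consume the augmented matrix.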
Pages: 14
Related papers
50 in total
  • [1] Active Learning Improves Performance on Symbolic Regression Tasks in StackGP
    Haut, Nathan
    Banzhaf, Wolfgang
    Punch, Bill
    PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2022, 2022, : 550 - 553
  • [2] Feature Standardisation in Symbolic Regression
    Owen, Caitlin A.
    Dick, Grant
    Whigham, Peter A.
    AI 2018: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11320 : 565 - 576
  • [3] Symbolic Regression on FPGAs for Fast Machine Learning Inference
    Tsoi, Ho Fung
    Pol, Adrian Alan
    Loncar, Vladimir
    Govorkova, Ekaterina
    Cranmer, Miles
    Dasu, Sridhara
    Elmer, Peter
    Harris, Philip
    Ojalvo, Isobel
    Pierini, Maurizio
    26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS, CHEP 2023, 2024, 295
  • [4] Deep Learning and Symbolic Regression for Discovering Parametric Equations
    Zhang, Michael
    Kim, Samuel
    Lu, Peter Y.
    Soljacic, Marin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (11) : 1 - 13
  • [5] Regression tasks in machine learning via Fenchel duality
    Radu Ioan Boţ
    André Heinrich
    Annals of Operations Research, 2014, 222 : 197 - 211
  • [6] Regression tasks in machine learning via Fenchel duality
    Bot, Radu Ioan
    Heinrich, Andre
    ANNALS OF OPERATIONS RESEARCH, 2014, 222 (01) : 197 - 211
  • [7] An active learning ensemble method for regression tasks
    Fazakis, Nikos
    Kostopoulos, Georgios
    Karlos, Stamatis
    Kotsiantis, Sotiris
    Sgarbas, Kyriakos
    INTELLIGENT DATA ANALYSIS, 2020, 24 (03) : 607 - 623
  • [8] Automatic feature engineering for regression models with machine learning: An evolutionary computation and statistics hybrid
    de Melo, Vinicius Veloso
    Banzhaf, Wolfgang
    INFORMATION SCIENCES, 2018, 430 : 287 - 313
  • [9] Interaction-transformation symbolic regression with extreme learning machine
    de Franca, Fabricio Olivetti
    de Lima, Maira Zabuscha
    NEUROCOMPUTING, 2021, 423 : 609 - 619
  • [10] Machine learning and symbolic regression investigation on stability of MXene materials
    He, Mu
    Zhang, Lei
    Zhang, Lei (002699@nuist.edu.cn), 1600, Elsevier B.V. (196):