Symbolic Regression with augmented dataset using RuleFit

Cited by: 1
Author
de Franca, Fabricio Olivetti [1]
Affiliation
[1] Univ Fed ABC, Ctr Math Comp & Cognit CMCC, Heurist & Anal Lab HAL, Santo Andre, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
symbolic regression; regression analysis; data augmentation;
DOI
10.1109/SYNASC57785.2022.00058
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification code
081203; 0835;
Abstract
Symbolic Regression models are often associated with transparency and interpretability, mainly because they can describe nonlinear relationships while balancing accuracy and conciseness. In practice, however, Symbolic Regression may generate models that are as hard to understand as opaque models. Linear models, by contrast, are guaranteed to be transparent but fail to capture nonlinearities and interactions. The RuleFit algorithm uses a tree-based nonlinear model to create meta-features that augment the dataset, increasing the accuracy of linear models while maintaining their transparency. In this paper we test whether this augmented dataset can help Symbolic Regression find more transparent models without reducing overall accuracy. The results indicate that the augmented models have slightly better accuracy on a class of benchmarks while keeping the expression size small and closer to a linear model. As a caveat, the models also tend to become closer to a step function, which limits the interpretability of the studied phenomena.
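The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the rules below are hand-written and hypothetical, standing in for the root-to-node paths that a fitted tree-based model would supply, and the names `rule_fires`, `augment`, and `RULES` are assumptions made for this example. Each rule becomes a binary meta-feature appended to the original columns, so a downstream linear or Symbolic Regression model sees both.

```python
from typing import List, Sequence, Tuple

# A condition is (feature_index, operator, threshold).
# A rule is a conjunction of conditions, like one path through a decision tree.
Condition = Tuple[int, str, float]
Rule = List[Condition]


def rule_fires(rule: Rule, row: Sequence[float]) -> int:
    """Return 1 if every condition of the rule holds for the row, else 0."""
    for idx, op, threshold in rule:
        value = row[idx]
        if op == "<=":
            holds = value <= threshold
        elif op == ">":
            holds = value > threshold
        else:
            raise ValueError(f"unsupported operator: {op!r}")
        if not holds:
            return 0
    return 1


def augment(rows: Sequence[Sequence[float]], rules: Sequence[Rule]) -> List[list]:
    """Append one binary meta-feature per rule to every row."""
    return [list(row) + [rule_fires(rule, row) for rule in rules] for row in rows]


# Hypothetical rules (assumption): in RuleFit these would be extracted
# from a fitted tree ensemble rather than written by hand.
RULES: List[Rule] = [
    [(0, "<=", 2.0)],                  # x0 <= 2.0
    [(0, ">", 2.0), (1, "<=", 0.5)],   # x0 > 2.0 and x1 <= 0.5
]

X = [[1.0, 0.2], [3.0, 0.4], [3.0, 0.9]]
X_aug = augment(X, RULES)

for row in X_aug:
    print(row)
```

Because every meta-feature is a 0/1 indicator, a model that relies heavily on these columns is piecewise constant in the original variables, which is consistent with the step-function caveat noted in the abstract.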
Pages: 323-326
Page count: 4
Related papers
50 in total
  • [41] Transformation of CPS coordinates using symbolic regression and genetic programming
    Chou, HJ
    Wu, CH
    Su, WH
    Proceedings of the 2005 IEEE International Conference on Computational Intelligence for Measurement Systems and Applications, 2005, : 301 - 306
  • [42] REGRESSION TECHNIQUES IN SOFTWARE EFFORT ESTIMATION USING COCOMO DATASET
    Anandhi, V.
    Chezian, R. Manicka
    2014 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING APPLICATIONS (ICICA 2014), 2014, : 353 - 357
  • [43] Generative Adversarial Network for Robust Regression using Continuous Dataset
    Min, Yu-Lim
    Hong, Seung-Jin
    Kim, Hye-jin
    Lee, Seung-Ik
    11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1209 - 1211
  • [44] Improvement of pulsars detection using dataset balancing methods and symbolic classification ensemble
    Andelic, N.
    ASTRONOMY AND COMPUTING, 2024, 47
  • [45] Smooth Symbolic Regression: Transformation of Symbolic Regression into a Real-Valued Optimization Problem
    Pitzer, Erik
    Kronberger, Gabriel
    COMPUTER AIDED SYSTEMS THEORY - EUROCAST 2015, 2015, 9520 : 375 - 383
  • [46] Prediction-based regularization using data augmented regression
    Giles Hooker
    Saharon Rosset
    Statistics and Computing, 2012, 22 : 237 - 249
  • [47] Prediction-based regularization using data augmented regression
    Hooker, Giles
    Rosset, Saharon
    STATISTICS AND COMPUTING, 2012, 22 (01) : 237 - 249
  • [48] Hand Orientation Regression Using Random Forest for Augmented Reality
    Asad, Muhammad
    Slabaugh, Greg
    AUGMENTED AND VIRTUAL REALITY, AVR 2014, 2014, 8853 : 159 - 174
  • [49] Bloat and Generalisation in Symbolic Regression
    Dick, Grant
    SIMULATED EVOLUTION AND LEARNING (SEAL 2014), 2014, 8886 : 491 - 502
  • [50] Symbolic-regression boosting
    Sipper, Moshe
    Moore, Jason H.
    GENETIC PROGRAMMING AND EVOLVABLE MACHINES, 2021, 22 (03) : 357 - 381