Firm failure prediction using genetic programming generated features

Cited by: 2
Authors
Zelenkov, Yuri [1]
Affiliations
[1] HSE Univ, Grad Sch Business, 11 Pokrovsky Blv, Moscow 109028, Russia
Keywords
Firm failure prediction; Genetic programming generated feature; Fitness function; Score of generated features; Unbalanced data; MULTIPLE-FEATURE CONSTRUCTION; FEATURE-SELECTION; BANKRUPTCY PREDICTION; CLASSIFICATION; ALGORITHM; CLASSIFIERS;
DOI
10.1016/j.eswa.2024.123839
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many studies on predicting firm failure have focused on finding new features that improve model accuracy. In this paper, genetic programming (GP) is used for this purpose. The main problem in GP is to specify a function that evaluates the fitness of a feature. Directly optimizing a machine learning (ML) model that uses a generated feature is in most cases computationally expensive: evolving a population of N programs over G generations while evaluating each model with K-fold cross-validation requires N × G × K model learning cycles. Many researchers therefore use scores that measure the relationship of the generated features to the class label. However, our empirical analysis shows that most such scores correlate poorly with ML model performance. The novelty of our work is that we introduce several ways of combining different scores into a single measure of expected model performance. Experimental results on data from Hungarian firms (7167 observations, class imbalance 9.37) using five ML models (Logistic Regression, Random Forest, Gradient Boosting, Histogram Boosting, and AdaBoost) show that the proposed way of setting the fitness function increases the ROC AUC of the listed models by 6.6%, 5.2%, 6.8%, 5.5%, and 5.2%, respectively. Moreover, by applying the found formula to data from Czech firms (3872 observations, class imbalance 74.92), which were not used for the feature search, we obtained ROC AUC increases of 13.1%, 11.8%, 14.9%, 9.9%, and 8.2%, respectively. This indicates that the proposed method makes it possible to find universal features, which opens the way to building effective models when data are insufficient (small number of observations, extreme imbalance, etc.).
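The cost argument in the abstract can be made concrete with a minimal sketch. It is an illustration only: the abstract does not say which scores the paper uses or how it combines them, so the two scores below (absolute correlation with the label and AUC-based separation of the raw feature), the plain average that merges them, and all function names are assumptions.

```python
from statistics import mean, pstdev


def wrapper_fit_count(n_programs: int, n_generations: int, k_folds: int) -> int:
    """Model fits needed when every candidate feature is scored by K-fold CV."""
    return n_programs * n_generations * k_folds


def abs_correlation(feature, labels):
    """Absolute Pearson correlation between a feature and 0/1 labels."""
    sx, sy = pstdev(feature), pstdev(labels)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no information
    mx, my = mean(feature), mean(labels)
    cov = mean([(x - mx) * (y - my) for x, y in zip(feature, labels)])
    return abs(cov / (sx * sy))


def auc_separation(feature, labels):
    """2 * |AUC - 0.5| of the raw feature, from pairwise comparisons."""
    pos = [x for x, y in zip(feature, labels) if y == 1]
    neg = [x for x, y in zip(feature, labels) if y == 0]
    if not pos or not neg:
        return 0.0
    # Each (positive, negative) pair counts 1 for a win and 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return 2 * abs(auc - 0.5)


def combined_fitness(feature, labels):
    """Hypothetical combination: the plain average of two scores in [0, 1]."""
    return (abs_correlation(feature, labels) + auc_separation(feature, labels)) / 2


if __name__ == "__main__":
    # Illustrative settings, not taken from the paper.
    print(wrapper_fit_count(n_programs=100, n_generations=50, k_folds=5))
    labels = [0, 0, 0, 1]
    print(combined_fitness([1, 2, 3, 10], labels))  # separates the classes
    print(combined_fitness([2, 1, 3, 2], labels))   # unrelated to the label
```

With the illustrative settings above, wrapper-style evaluation would need 25,000 model fits, whereas a score-based fitness needs only one pass over the data per candidate feature and no model training.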
Pages: 12
Related papers
50 in total
  • [21] Wave Prediction Using Genetic Programming and Model Trees
    Rambekar, A. R.
    Deo, M. C.
    JOURNAL OF COASTAL RESEARCH, 2012, 28 (01) : 43 - 50
  • [22] Prediction of nonlinear system in optics using genetic programming
    Radi, Amr
    INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 2007, 18 (03) : 369 - 374
  • [23] Prediction of microscopic residual stresses using genetic programming
    Millan, Laura
    Kronberger, Gabriel
    Fernandez, Ricardo
    Bokuchava, Gizo
    Halodova, Patrice
    Saez-Maderuelo, Alberto
    Gonzalez-Doncel, Gaspar
    Hidalgo, J. Ignacio
    APPLICATIONS IN ENGINEERING SCIENCE, 2023, 15
  • [24] EEG based personality prediction using genetic programming
    Bhardwaj, Harshit
    Tomar, Pradeep
    Sakalle, Aditi
    Bhardwaj, Arpit
    Asthana, Rishi
    Vidyarthi, Ankit
    ASIAN JOURNAL OF CONTROL, 2023, 25 (05) : 3330 - 3342
  • [25] Prediction of wave ripple characteristics using genetic programming
    Goldstein, Evan B.
    Coco, Giovanni
    Murray, A. Brad
    CONTINENTAL SHELF RESEARCH, 2013, 71 : 1 - 15
  • [26] PREDICTION OF BRIDGE PIER SCOUR USING GENETIC PROGRAMMING
    Wang, Chuan-Yi
    Shih, Han-Peng
    Hong, Jian-Hao
    Raikar, Rajkumar V.
    JOURNAL OF MARINE SCIENCE AND TECHNOLOGY-TAIWAN, 2013, 21 (04) : 483 - 492
  • [27] The prediction of journey times on motorways using genetic programming
    Howard, D
    Roberts, SC
    APPLICATIONS OF EVOLUTIONARY COMPUTING, PROCEEDINGS, 2002, 2279 : 210 - 221
  • [28] Business failure prediction models with high and stable predictive power over time using genetic programming
    Beade, Angel
    Rodriguez, Manuel
    Santos, Jose
    OPERATIONAL RESEARCH, 2024, 24 (03)
  • [29] Texture classifiers generated by genetic programming
    Song, A
    Ciesielski, V
    Williams, HE
    CEC'02: PROCEEDINGS OF THE 2002 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2002 : 243 - 248
  • [30] Wear particle classification using genetic programming evolved features
    Xu, Bin
    Wen, Guangrui
    Zhang, Zhifen
    Chen, Feng
    LUBRICATION SCIENCE, 2018, 30 (05) : 229 - 246