Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects

Cited by: 121
Authors
Dumitrescu, Elena [1]
Hue, Sullivan [2]
Hurlin, Christophe [2]
Tokpavi, Sessi [2]
Affiliations
[1] Univ Paris Nanterre, EconomiX CNRS, 200 Ave Republ, F-92000 Nanterre, France
[2] Univ Orleans, CNRS, LEO FRE 2014, Rue Blois, F-45067 Orleans, France
Keywords
Risk management; Credit scoring; Machine learning; Interpretability; Econometrics; ART CLASSIFICATION ALGORITHMS; VARIABLE SELECTION; RULE EXTRACTION; NEURAL-NETWORKS; ADAPTIVE LASSO; MODELS; REGULARIZATION; PREDICTION; RISK;
DOI
10.1016/j.ejor.2021.06.053
Chinese Library Classification (CLC) number
C93 [Management Science];
Discipline classification codes
12 ; 1201 ; 1202 ; 120202 ;
Abstract
In the context of credit scoring, ensemble methods based on decision trees, such as the random forest method, provide better classification performance than standard logistic regression models. However, logistic regression remains the benchmark in the credit risk industry mainly because the lack of interpretability of ensemble methods is incompatible with the requirements of financial regulators. In this paper, we propose a high-performance and interpretable credit scoring method called penalised logistic tree regression (PLTR), which uses information from decision trees to improve the performance of logistic regression. Formally, rules extracted from various short-depth decision trees built with original predictive variables are used as predictors in a penalised logistic regression model. PLTR allows us to capture non-linear effects that can arise in credit scoring data while preserving the intrinsic interpretability of the logistic regression model. Monte Carlo simulations and empirical applications using four real credit default datasets show that PLTR predicts credit risk significantly more accurately than logistic regression and compares competitively to the random forest method. (c) 2021 Published by Elsevier B.V.
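The idea in the abstract (rules from short-depth trees used as predictors in a penalised logistic regression) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: the helper names `best_split`, `build_rules` and `fit_l1_logistic`, the simulated data, the choice to search the second split only in the left child node, and the plain L1 penalty fitted by proximal gradient descent (the abstract does not state which penalty is used).

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def best_split(values, labels):
    """Threshold minimising the weighted Gini impurity of a single split.
    Returns None when no split is possible (fewer than two distinct values)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_pos = sum(lab for _, lab in pairs)
    best_t, best_score, left_pos = None, float("inf"), 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical values
        pl = left_pos / i
        pr = (total_pos - left_pos) / (n - i)
        score = i * 2 * pl * (1 - pl) + (n - i) * 2 * pr * (1 - pr)
        if score < best_score:
            best_score = score
            best_t = (pairs[i][0] + pairs[i - 1][0]) / 2
    return best_t

def build_rules(X, y):
    """Binary rules from depth-1 splits (one variable) and depth-2 splits
    (second variable split inside the left child; a simplifying assumption)."""
    p = len(X[0])
    rules = []  # list of (name, function mapping a row to 0/1)
    for j in range(p):
        tj = best_split([r[j] for r in X], y)
        if tj is None:
            continue
        rules.append((f"x{j}<={tj:.3f}",
                      lambda r, j=j, tj=tj: int(r[j] <= tj)))
        left = [(r, lab) for r, lab in zip(X, y) if r[j] <= tj]
        for k in range(p):
            if k == j:
                continue
            tk = best_split([r[k] for r, _ in left], [lab for _, lab in left])
            if tk is None:
                continue
            rules.append((f"x{j}<={tj:.3f} & x{k}<={tk:.3f}",
                          lambda r, j=j, tj=tj, k=k, tk=tk:
                          int(r[j] <= tj and r[k] <= tk)))
    return rules

def fit_l1_logistic(X, y, lam=0.01, lr=0.1, n_iter=500):
    """L1-penalised logistic regression via proximal gradient descent.
    The intercept b is left unpenalised."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        gw, gb = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(p):
                gw[j] += err * xi[j]
        b -= lr * gb / n
        for j in range(p):
            v = w[j] - lr * gw[j] / n
            # Soft-thresholding step: shrinks small coefficients to exactly 0.
            w[j] = math.copysign(max(abs(v) - lr * lam, 0.0), v)
    return w, b

# Simulated data with a purely non-linear (threshold and interaction) effect.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [int(r[0] <= 0.4 and r[1] <= 0.5) for r in X]

rules = build_rules(X, y)
Z = [r + [f(r) for _, f in rules] for r in X]  # original variables + rules
w, b = fit_l1_logistic(Z, y)
probs = [sigmoid(b + sum(wj * zj for wj, zj in zip(w, row))) for row in Z]

for (name, _), coef in zip(rules, w[2:]):
    print(name, round(coef, 3))
```

Each printed line pairs a rule with its fitted coefficient, which is what keeps the model readable: a non-zero coefficient attaches a log-odds contribution to a plain threshold statement on one or two of the original variables.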
Pages: 1178-1192
Number of pages: 15
Related papers
50 in total
  • [1] LOGISTIC REGRESSION AND MULTICRITERIA DECISION MAKING IN CREDIT SCORING
    Sarlija, Natasa
    Soric, Kristina
    Vlah, Silvija
    Rosenzweig, Visnja Vojvodic
    [J]. PROCEEDINGS OF THE 10TH INTERNATIONAL SYMPOSIUM ON OPERATIONAL RESEARCH SOR 09, 2009, : 175 - +
  • [2] Small business credit scoring: A comparison of logistic regression, neural network, and decision tree models
    Zekic-Susac, M
    Sarlija, N
    Bensic, M
    [J]. ITI 2004: PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY INTERFACES, 2004, : 265 - 270
  • [3] Estimating credit and profit scoring of a Brazilian credit union with logistic regression and machine-learning techniques
    Vasconcellos de Paula, Daniel Abreu
    Artes, Rinaldo
    Ayres, Fabio
    Accioly Fonseca Minardi, Andrea Maria
    [J]. RAUSP MANAGEMENT JOURNAL, 2019, 54 (03): : 321 - 336
  • [4] A COMPARISON OF LOGISTIC-REGRESSION TO DECISION-TREE INDUCTION IN A MEDICAL DOMAIN
    LONG, WJ
    GRIFFITH, JL
    SELKER, HP
    DAGOSTINO, RB
    [J]. COMPUTERS AND BIOMEDICAL RESEARCH, 1993, 26 (01): : 74 - 97
  • [5] Credit card churn forecasting by logistic regression and decision tree
    Nie, Guangli
    Wei Rowe
    Zhang, Lingling
    Tian, Yingjie
    Shi, Yong
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (12) : 15273 - 15285
  • [6] Machine learning and decision support system on credit scoring
    Teles, Gernmanno
    Rodrigues, Joel J. P. C.
    Saleem, Kashif
    Kozlov, Sergei
    Rabelo, Ricardo A. L.
    [J]. NEURAL COMPUTING & APPLICATIONS, 2020, 32 (14): : 9809 - 9826
  • [7] Machine learning and decision support system on credit scoring
    Gernmanno Teles
    Joel J. P. C. Rodrigues
    Kashif Saleem
    Sergei Kozlov
    Ricardo A. L. Rabêlo
    [J]. Neural Computing and Applications, 2020, 32 : 9809 - 9826
  • [8] Credit Rating Analysis by the Decision-Tree Support Vector Machine with Ensemble Strategies
    Pai, Ping-Feng
    Tan, Yi-Shien
    Hsu, Ming-Fu
    [J]. INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2015, 17 (04) : 521 - 530
  • [9] Credit Rating Analysis by the Decision-Tree Support Vector Machine with Ensemble Strategies
    Ping-Feng Pai
    Yi-Shien Tan
    Ming-Fu Hsu
    [J]. International Journal of Fuzzy Systems, 2015, 17 : 521 - 530
  • [10] Explainable Machine Learning for Improving Logistic Regression Models
    Yang, Yimin
    Wu, Min
    [J]. 2021 IEEE 19TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2021,