Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects

Cited by: 120
Authors
Dumitrescu, Elena [1]
Hue, Sullivan [2]
Hurlin, Christophe [2]
Tokpavi, Sessi [2]
Affiliations
[1] Univ Paris Nanterre, EconomiX CNRS, 200 Ave Republ, F-92000 Nanterre, France
[2] Univ Orleans, CNRS, LEO FRE 2014, Rue Blois, F-45067 Orleans, France
Keywords
Risk management; Credit scoring; Machine learning; Interpretability; Econometrics; ART CLASSIFICATION ALGORITHMS; VARIABLE SELECTION; RULE EXTRACTION; NEURAL-NETWORKS; ADAPTIVE LASSO; MODELS; REGULARIZATION; PREDICTION; RISK;
DOI
10.1016/j.ejor.2021.06.053
Chinese Library Classification (CLC) number
C93 [Management science];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
In the context of credit scoring, ensemble methods based on decision trees, such as the random forest method, provide better classification performance than standard logistic regression models. However, logistic regression remains the benchmark in the credit risk industry mainly because the lack of interpretability of ensemble methods is incompatible with the requirements of financial regulators. In this paper, we propose a high-performance and interpretable credit scoring method called penalised logistic tree regression (PLTR), which uses information from decision trees to improve the performance of logistic regression. Formally, rules extracted from various short-depth decision trees built with original predictive variables are used as predictors in a penalised logistic regression model. PLTR allows us to capture non-linear effects that can arise in credit scoring data while preserving the intrinsic interpretability of the logistic regression model. Monte Carlo simulations and empirical applications using four real credit default datasets show that PLTR predicts credit risk significantly more accurately than logistic regression and compares competitively to the random forest method. (c) 2021 Published by Elsevier B.V.
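The idea described in the abstract, using tree-derived rules as extra predictors in a penalised logistic regression, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes depth-1 trees (stumps) only, a plain L1 penalty fitted by proximal gradient descent, and a synthetic toy dataset with a single threshold effect. All function names, the data, and the hyperparameters (`lam`, `lr`, `n_iter`) are hypothetical and chosen only for illustration.

```python
import math


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty node)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump_threshold(values, labels):
    """Depth-1 tree on one variable: return the split minimising weighted Gini."""
    best_t, best_score = None, float("inf")
    cuts = sorted(set(values))
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2.0
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def add_rule_features(X, thresholds):
    """Append one binary rule 1[x_j <= t_j] per original variable."""
    return [row + [1.0 if row[j] <= t else 0.0 for j, t in enumerate(thresholds)]
            for row in X]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def soft_threshold(w, k):
    """Proximal operator of the L1 penalty: shrink w towards zero by k."""
    return math.copysign(max(abs(w) - k, 0.0), w)


def fit_l1_logistic(X, y, lam=0.01, lr=0.5, n_iter=300):
    """L1-penalised logistic regression via proximal gradient descent.

    The intercept b is not penalised. The penalty and solver are
    assumptions made for this sketch.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - target
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic toy data (assumption): default occurs exactly when x1 exceeds 0.25.
X = [[i / 20 - 1, ((i * 7) % 10) / 10] for i in range(40)]
y = [1 if row[0] > 0.25 else 0 for row in X]

# Step 1: extract one threshold rule per variable from a depth-1 tree.
thresholds = [best_stump_threshold([row[j] for row in X], y) for j in range(2)]

# Step 2: fit the penalised logistic regression on original variables plus rules.
X_aug = add_rule_features(X, thresholds)
w, b = fit_l1_logistic(X_aug, y)

print("stump thresholds:", thresholds)
print("weights (x1, x2, rule1, rule2):", w, "intercept:", b)
```

Because the fitted model is still a logistic regression, each rule enters with a single coefficient that can be read like any other, which is the interpretability argument the abstract makes. The abstract's "various short-depth decision trees" suggests more than the single-variable stumps shown here, so deeper trees capturing interactions would be a natural extension of this sketch.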
Pages: 1178-1192
Page count: 15