An ensemble credit scoring model based on logistic regression with heterogeneous balancing and weighting effects

Cited by: 10
Authors
Runchi, Zhang [1]
Liguo, Xue [2]
Qin, Wang [1]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Econ, 9 Wen Yuan Rd, Nanjing 210023, Jiangsu, Peoples R China
[2] Nanjing Univ, Sch Business, 22 Han Kou Rd, Nanjing 210093, Jiangsu, Peoples R China
Keywords
Logistic regression; Logistic-BWE model; Sample balancing algorithm; Ensemble credit scoring models; Dynamic weighting; CLASSIFICATION; MACHINE;
DOI
10.1016/j.eswa.2022.118732
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The logistic regression model is widely used in credit scoring practice because its results are highly interpretable, but its ability to recognize default samples, which are the minority class in real-world imbalanced data sets, needs to be improved. This paper designs a novel ensemble model based on logistic regression, called the logistic-BWE model. The model first carries out data preprocessing, then applies a sample balancing algorithm to generate several training sub data sets with different imbalance ratios and constructs a sub model on each. Finally, according to the performance of each sub model in the validation stage, it dynamically calculates the weight of each sub model's predicted results for each class. The empirical results indicate that, compared with ten representative credit scoring models on six public data sets, the logistic-BWE model has the strongest ability to recognize default samples and the best generalization ability on most data sets, while maintaining interpretability. Further tests demonstrate that the performance superiority of the logistic-BWE model is statistically significant and that the model is highly robust when it contains a sufficient number of sub models.
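The pipeline outlined in the abstract (sub data sets with different imbalance ratios, one logistic sub model per sub data set, class-specific weights from validation performance) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the balancing algorithm or the weighting formula, so random undersampling, precision-based class weights, plain gradient descent, and all function names and the toy data here are assumptions.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=200):
    # Batch gradient descent on the log-loss; the bias is stored last in w.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(a * b for a, b in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    # Class 1 = default, class 0 = non-default.
    return 1 if sigmoid(sum(a * b for a, b in zip(w, x)) + w[-1]) >= 0.5 else 0

def undersample(X, y, ratio, rng):
    # ASSUMPTION: random undersampling stands in for the paper's balancing
    # algorithm. Keeps every default and draws ratio * n_defaults non-defaults.
    pos = [i for i, v in enumerate(y) if v == 1]
    neg = [i for i, v in enumerate(y) if v == 0]
    k = min(len(neg), max(1, int(ratio * len(pos))))
    idx = pos + rng.sample(neg, k)
    return [X[i] for i in idx], [y[i] for i in idx]

def class_weights(w, X_val, y_val):
    # ASSUMPTION: the weight of a sub model's vote for class c is its
    # validation precision on class c; the paper's formula may differ.
    preds = [predict(w, x) for x in X_val]
    weights = {}
    for c in (0, 1):
        truths = [t for p, t in zip(preds, y_val) if p == c]
        weights[c] = sum(1 for t in truths if t == c) / len(truths) if truths else 0.0
    return weights

def fit_ensemble(X_tr, y_tr, X_val, y_val, ratios=(1, 2, 3), seed=0):
    # One sub model per imbalance ratio, each with its own class weights.
    rng = random.Random(seed)
    members = []
    for r in ratios:
        Xs, ys = undersample(X_tr, y_tr, r, rng)
        w = fit_logistic(Xs, ys)
        members.append((w, class_weights(w, X_val, y_val)))
    return members

def ensemble_predict(members, x):
    # Each sub model votes for its predicted class with that class's weight.
    score = {0: 0.0, 1: 0.0}
    for w, cw in members:
        c = predict(w, x)
        score[c] += cw[c]
    return 1 if score[1] >= score[0] else 0

def toy_data(n_pos, n_neg, rng):
    # Hypothetical synthetic data: defaults near (2, 2), non-defaults near (-2, -2).
    X = [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(n_pos)]
    X += [[rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)] for _ in range(n_neg)]
    return X, [1] * n_pos + [0] * n_neg

data_rng = random.Random(42)
X_tr, y_tr = toy_data(20, 100, data_rng)   # imbalanced training set
X_val, y_val = toy_data(10, 50, data_rng)  # validation set for the weights
members = fit_ensemble(X_tr, y_tr, X_val, y_val)
```

Because every sub model is an ordinary logistic regression, its coefficients remain directly inspectable, which is the interpretability property the abstract emphasizes; only the final vote combines the sub models.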
Pages: 13