Enhancing credit scoring accuracy with a comprehensive evaluation of alternative data

Cited by: 3
Authors
Hlongwane, Rivalani [1]
Ramaboa, Kutlwano K. K. M. [1]
Mongwe, Wilson [2]
Affiliations
[1] Univ Cape Town, Grad Sch Business, Cape Town, South Africa
[2] Univ Johannesburg, Elect & Elect Engn, Johannesburg, South Africa
Source
PLOS ONE | 2024 / Vol. 19 / Issue 05
Keywords
ALGORITHMS; DEFAULT; MODEL; OPTIMIZATION; PERFORMANCE; PREDICTION; XGBOOST; AREA;
DOI
10.1371/journal.pone.0303566
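Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy, earth sciences]; Q [Biological sciences]; N [General natural sciences];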
Subject classification code
07 ; 0710 ; 09 ;
Abstract
This study explores the potential of utilizing alternative data sources to enhance the accuracy of credit scoring models, compared to relying solely on traditional data sources, such as credit bureau data. A comprehensive dataset from the Home Credit Group's home loan portfolio is analysed. The research examines the impact of incorporating alternative predictors that are typically overlooked, such as an applicant's social network default status, regional economic ratings, and local population characteristics. The modelling approach applies the model-X knockoffs framework for systematic variable selection. By including these alternative data sources, the credit scoring models demonstrate improved predictive performance, achieving an area under the curve metric of 0.79360 on the Kaggle Home Credit default risk competition dataset, outperforming models that relied solely on traditional data sources, such as credit bureau data. The findings highlight the significance of leveraging diverse, non-traditional data sources to augment credit risk assessment capabilities and overall model accuracy.
Pages: 18