Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction

Cited by: 0
Authors
Akinjole, Abisola [1]
Shobayo, Olamilekan [1]
Popoola, Jumoke [1]
Okoyeigbo, Obinna [2]
Ogunleye, Bayode [3]
Affiliations
[1] Sheffield Hallam Univ, Dept Comp, Sheffield S1 2NU, England
[2] Edge Hill Univ, Dept Psychol, Ormskirk L39 4QP, England
[3] Univ Brighton, Dept Comp & Math, Brighton BN2 4GJ, England
Keywords
credit default prediction; deep learning; ensemble learning; machine learning; CREDIT; NETWORK; TREES; SMOTE;
DOI
10.3390/math12213423
Chinese Library Classification (CLC) number
O1 [Mathematics];
Subject classification code
0701 ; 070101 ;
Abstract
Predicting credit default risk is important to financial institutions: accurately estimating the likelihood that a borrower will default on a loan helps to reduce financial losses and so maintain profitability and stability. Although machine learning models have been used to assess large applications with complex attributes for these predictions, there is still a need to identify the most effective techniques for the model development process, including techniques for addressing data imbalance. In this research, we conducted a comparative analysis of random forest, decision tree, support vector machines (SVMs), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost) and the multilayer perceptron for predicting credit defaults using loan data from LendingClub. XGBoost was also used as a framework for testing and evaluating various techniques. We applied this XGBoost framework to the observed class imbalance by testing resampling methods such as Random Over-Sampling (ROS), the Synthetic Minority Over-Sampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), Random Under-Sampling (RUS), and hybrid approaches such as SMOTE with Tomek Links and SMOTE with Edited Nearest Neighbours (SMOTE + ENNs). The results showed that the balanced datasets significantly outperformed the imbalanced dataset, with SMOTE + ENNs delivering the best overall performance: an accuracy of 90.49%, a precision of 94.61% and a recall of 92.02%. Ensemble methods such as voting and stacking were then employed to enhance performance further. Our proposed model achieved an accuracy of 93.7%, a precision of 95.6% and a recall of 95.5%, which shows the potential of ensemble methods to improve credit default prediction and to give lending platforms a tool for reducing default rates and financial losses. In conclusion, the findings of this study have broader implications for financial institutions, offering a robust approach to risk assessment beyond the LendingClub dataset.
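As a rough illustration of two of the resampling methods named in the abstract (ROS and RUS) and of the three reported metrics, the sketch below uses only the Python standard library. It is not the authors' implementation: the function names, the toy data, and the choice of label 1 as the positive ("default") class are all assumptions made for illustration. SMOTE, ADASYN and the hybrid variants are not shown.

```python
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """ROS: duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_under_sample(X, y, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) with respect to the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy data (made up, not from LendingClub): 8 repaid loans (0), 2 defaults (1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_ros, y_ros = random_over_sample(X, y)
X_rus, y_rus = random_under_sample(X, y)
print(Counter(y_ros), Counter(y_rus))
```

In practice, resampling would be applied to the training split only, so that the test set keeps the original class distribution.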
Pages: 31