Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction

Authors
Akinjole, Abisola [1]
Shobayo, Olamilekan [1]
Popoola, Jumoke [1]
Okoyeigbo, Obinna [2]
Ogunleye, Bayode [3]
Affiliations
[1] Sheffield Hallam Univ, Dept Comp, Sheffield S1 2NU, England
[2] Edge Hill Univ, Dept Psychol, Ormskirk L39 4QP, England
[3] Univ Brighton, Dept Comp & Math, Brighton BN2 4GJ, England
Keywords
credit default prediction; deep learning; ensemble learning; machine learning; CREDIT; NETWORK; TREES; SMOTE;
DOI
10.3390/math12213423
Chinese Library Classification
O1 [Mathematics];
Subject classification code
0701; 070101;
Abstract
Predicting credit default risk is important to financial institutions: accurately estimating the likelihood that a borrower will default on a loan helps reduce financial losses and thereby maintain profitability and stability. Although machine learning models have been used to assess large applications with complex attributes for these predictions, there is still a need to identify the most effective techniques for the model development process, including techniques for addressing data imbalance. In this research, we conducted a comparative analysis of random forest, decision tree, Support Vector Machines (SVMs), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost) and the multilayer perceptron for predicting credit defaults using loan data from LendingClub. XGBoost was also used as a framework for testing and evaluating various techniques. We applied this XGBoost framework to the observed class imbalance by testing resampling methods such as Random Over-Sampling (ROS), the Synthetic Minority Over-Sampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), Random Under-Sampling (RUS), and hybrid approaches such as SMOTE with Tomek Links and SMOTE with Edited Nearest Neighbours (SMOTE + ENN). The results showed that balanced datasets significantly outperformed the imbalanced dataset, with SMOTE + ENN delivering the best overall performance: an accuracy of 90.49%, a precision of 94.61% and a recall of 92.02%. Ensemble methods such as voting and stacking were then employed to enhance performance further. Our proposed model achieved an accuracy of 93.7%, a precision of 95.6% and a recall of 95.5%, which shows the potential of ensemble methods for improving credit default predictions and for giving lending platforms a tool to reduce default rates and financial losses.
The findings of this study have broader implications for financial institutions, offering a robust approach to risk assessment beyond the LendingClub dataset.
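As a rough illustration of the ideas named in the abstract, the following Python sketch shows the two simplest resampling methods (ROS and RUS), hard majority voting across several models, and the three reported metrics. It is not the authors' pipeline: SMOTE, ADASYN, Tomek Links, ENN, stacking and the XGBoost framework are not reproduced here, and every function name and the toy data are hypothetical, chosen only for this example.

```python
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """ROS: duplicate random rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_under_sample(X, y, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def hard_vote(predictions):
    """Majority label per sample; `predictions` holds one label list per model.

    Use an odd number of models for binary labels to avoid ties.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def scores(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    # Toy imbalanced data (hypothetical): 8 rows of class 0, 2 rows of class 1.
    X = [[i] for i in range(10)]
    y = [0] * 8 + [1] * 2
    print(Counter(random_over_sample(X, y)[1]))
    print(Counter(random_under_sample(X, y)[1]))
    print(hard_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]]))
    print(scores([1, 0, 1, 1], [1, 0, 0, 1]))
```

Resampling would normally be applied to the training split only, so that the evaluation data keeps its original class distribution; the abstract does not state how the split was handled, so that remains an assumption here.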
Pages: 31