A comparative analysis of machine learning techniques for imbalanced data

Cited by: 1
Authors
Mrad, Ali Ben [1, 2]
Lahiani, Amine [3, 4, 5]
Mefteh-Wali, Salma [6 ]
Mselmi, Nada [7 ]
Affiliations
[1] Qassim Univ, Coll Comp, Dept Comp Sci, Buraydah, Saudi Arabia
[2] Univ Sfax, CES Lab, ENIS, Sfax, Tunisia
[3] LEO Lab Econ Orleans, Orleans, France
[4] Gulf Univ Sci & Technol, Ctr Sustainable Dev, Kuwait, Kuwait
[5] South Ural State Univ, Chelyabinsk, Russia
[6] ESSCA Sch Management, Angers, France
[7] Paris Saclay Univ, RITM, Paris, France
Keywords
Bank inactivity; Classification; Machine learning; MULTIVARIATE STATISTICAL-ANALYSIS; BANK FAILURE; BANKRUPTCY PREDICTION; INVESTOR SENTIMENT; FINANCIAL RATIOS; NEURAL-NETWORK; DISTRESS; PERFORMANCE; INSOLVENCY; SECTOR
DOI
10.1007/s10479-024-06018-0
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
This study compares the predictive accuracy of a set of machine learning models coupled with three resampling techniques (Random Undersampling, Random Oversampling, and Synthetic Minority Oversampling Technique) in predicting bank inactivity. Our sample includes listed banks in EU-28 member states between 2011 and 2019. We employ 23 financial ratios covering capital adequacy, asset quality, management capability, earnings, liquidity, and sensitivity indicators. The empirical findings show that XGBoost performs exceptionally well as a classifier of bank inactivity, particularly one year before the event. Random Forest with Synthetic Minority Oversampling Technique achieves the highest predictive accuracy two years prior to inactivity, while XGBoost with Random Oversampling outperforms other methods three years in advance. The results also highlight management capability and loan quality ratios as key factors in predicting bank inactivity. Our findings present important policy implications.
  • Bank inactivity predictive accuracy of machine learning techniques with resampling techniques is analyzed.
  • Data on banks in the EU-28 member states between 2011 and 2019 are used.
  • XGBoost performs exceptionally well one year before inactivity.
  • Random Forest with Synthetic Minority Oversampling is the best classifier two years before inactivity.
  • XGBoost with Random Oversampling outperforms other methods three years before inactivity.
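The three resampling techniques named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`random_oversample`, `random_undersample`, `smote_like`), the toy data, and the parameter `k` are hypothetical, and `smote_like` is a simplified stand-in for the Synthetic Minority Oversampling Technique that interpolates between a minority row and one of its nearest minority neighbours. In practice a library such as imbalanced-learn would normally be used instead.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)
    for label, n in counts.items():
        rows = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(list(rng.choice(rows)))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [row for row, lab in zip(X, y) if lab == label]
        for row in rng.sample(rows, target):
            X_out.append(list(row))
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=2, seed=0):
    """Add synthetic minority rows on the segment between a minority row
    and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    rows = [row for row, lab in zip(X, y) if lab == minority]
    if len(rows) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    X_out, y_out = [list(row) for row in X], list(y)
    for _ in range(target - counts[minority]):
        base = rng.choice(rows)
        others = [row for row in rows if row is not base]
        # Rank the other minority rows by squared Euclidean distance to base.
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(minority)
    return X_out, y_out


# Toy data (made up for illustration): six "active" rows (0), two "inactive" rows (1).
X = [[0.10, 0.30], [0.12, 0.28], [0.11, 0.35], [0.09, 0.31],
     [0.13, 0.29], [0.10, 0.33], [0.02, 0.60], [0.03, 0.55]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
X_smote, y_smote = smote_like(X, y, minority=1, k=1)
```

Each function returns a balanced copy of the data and leaves the inputs unchanged. Resampling is usually applied to the training split only, so that the evaluation data keep the original class ratio.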
Pages: 27