A comparative analysis of machine learning techniques for imbalanced data

Cited by: 1
Authors
Mrad, Ali Ben [1, 2]
Lahiani, Amine [3, 4, 5]
Mefteh-Wali, Salma [6]
Mselmi, Nada [7]
Affiliations
[1] Qassim Univ, Coll Comp, Dept Comp Sci, Buraydah, Saudi Arabia
[2] Univ Sfax, CES Lab, ENIS, Sfax, Tunisia
[3] LEO Lab Econ Orleans, Orleans, France
[4] Gulf Univ Sci & Technol, Ctr Sustainable Dev, Kuwait, Kuwait
[5] South Ural State Univ, Chelyabinsk, Russia
[6] ESSCA Sch Management, Angers, France
[7] Paris Saclay Univ, RITM, Paris, France
Keywords
Bank inactivity; Classification; Machine learning; MULTIVARIATE STATISTICAL-ANALYSIS; BANK FAILURE; BANKRUPTCY PREDICTION; INVESTOR SENTIMENT; FINANCIAL RATIOS; NEURAL-NETWORK; DISTRESS; PERFORMANCE; INSOLVENCY; SECTOR
DOI
10.1007/s10479-024-06018-0
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
This study compares the predictive accuracy of a set of machine learning models coupled with three resampling techniques (Random Undersampling, Random Oversampling, and Synthetic Minority Oversampling Technique) in predicting bank inactivity. Our sample includes listed banks in EU-28 member states between 2011 and 2019. We employ 23 financial ratios covering capital adequacy, asset quality, management capability, earnings, liquidity, and sensitivity indicators. The empirical findings show that XGBoost performs exceptionally well as a classifier in predicting bank inactivity, particularly one year before the event. Random Forest with Synthetic Minority Oversampling Technique achieves the highest predictive accuracy two years prior to inactivity, while XGBoost with Random Oversampling outperforms other methods three years in advance. The results also highlight management capability and loan quality ratios as key factors in predicting bank inactivity. These findings have important policy implications.

Bank inactivity predictive accuracy of machine learning techniques with resampling techniques is analyzed.
Data on banks in the EU-28 member states between 2011 and 2019 are used.
XGBoost performs exceptionally well one year before inactivity.
Random Forest with Synthetic Minority Oversampling is the best classifier two years before inactivity.
XGBoost with Random Oversampling outperforms other methods three years before inactivity.
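To make the two random resampling techniques named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the function names `random_oversample` and `random_undersample` and the toy data are assumptions made for illustration, and SMOTE (which synthesizes new minority points by interpolating between neighbours) is left out. In practice, resampling is usually applied to the training split only.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect feature rows under their class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        # Sampling with replacement fills the gap to the majority size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each larger class so that every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = min(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        # Sampling without replacement discards surplus majority rows.
        X_out.extend(rng.sample(rows, target))
        y_out.extend([label] * target)
    return X_out, y_out


# Hypothetical toy data: 8 active banks (label 0), 2 inactive banks (label 1),
# each described by two made-up ratio values.
X_toy = [[0.1 * i, 1.0 - 0.1 * i] for i in range(10)]
y_toy = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X_toy, y_toy)
X_under, y_under = random_undersample(X_toy, y_toy)
print(Counter(y_over), Counter(y_under))
```

Oversampling keeps all the original observations at the cost of duplicated minority rows, while undersampling discards majority rows. Which trade-off works better depends on the classifier and the prediction horizon, which is the comparison the study reports.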
Pages: 27