Bankruptcy prediction using machine learning models with the text-based communicative value of annual reports

Cited by: 13
Authors
Chen, Tsung-Kang [1, 3]
Liao, Hsien-Hsing [2 ]
Chen, Geng-Dao [1 ]
Kang, Wei-Han [1 ]
Lin, Yu-Chun [1 ]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Dept Management Sci, Hsinchu, Taiwan
[2] Natl Taiwan Univ, Dept Finance, New Taipei, Taiwan
[3] Natl Taiwan Univ, Ctr Res Econometr Theory & Applicat, New Taipei, Taiwan
Keywords
Annual report text-based communicative value; Bankruptcy prediction; Machine learning; Credit risk; Incomplete information; ANNUAL-REPORT READABILITY; FINANCIAL RATIOS; COMPLEXITY; DISCLOSURE; EARNINGS; IMPACT; FOG;
DOI
10.1016/j.eswa.2023.120714
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We investigate whether including the text-based communicative value of annual reports increases the predictive power of four machine learning models (logistic regression, Random Forest, XGBoost, and Support Vector Machine) for corporate bankruptcy prediction, using U.S. firm observations from 1994 to 2018. We find that the overall prediction effectiveness of these four models (e.g., accuracy, F1-score, and AUC) improves significantly, particularly for the XGBoost and Random Forest models. In addition, we find that annual report text-based communicative value variables significantly reduce the models' Type II error while keeping the Type I error at a relatively low level, especially for short-term bankruptcy forecasts. These results indicate that annual report text-based communicative value effectively mitigates the misidentification of non-bankrupt firms as bankrupt firms. Our results also suggest that annual report text-based communicative value is helpful for banks' corporate loan underwriting decisions. Finally, our findings still hold when we consider different testing periods and random state settings, use another publicly available bankruptcy dataset, and introduce neural network models.
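The error measures named in the abstract can be illustrated with a small sketch. It is not the authors' code. It assumes the convention implied by the abstract's wording, where a Type II error is a non-bankrupt firm flagged as bankrupt and a Type I error is a bankrupt firm classified as non-bankrupt. The function name `bankruptcy_error_rates` and the toy labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def bankruptcy_error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute error rates for binary labels where 1 = bankrupt, 0 = non-bankrupt.

    Assumed convention (inferred from the abstract, not confirmed by the paper):
      Type I error  = bankrupt firm predicted as non-bankrupt
      Type II error = non-bankrupt firm predicted as bankrupt
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bankrupt, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bankrupt, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared

    return {
        "type_i": fn / (tp + fn) if (tp + fn) else 0.0,
        "type_ii": fp / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }


# Toy labels (made up): two bankrupt firms and six non-bankrupt firms.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
pred_baseline = [1, 0, 1, 1, 0, 0, 0, 0]   # hypothetical model without text variables
pred_with_text = [1, 0, 1, 0, 0, 0, 0, 0]  # hypothetical model with text variables

baseline = bankruptcy_error_rates(y_true, pred_baseline)
with_text = bankruptcy_error_rates(y_true, pred_with_text)
```

In this made-up example the second set of predictions clears one more non-bankrupt firm, so the Type II error rate falls while the Type I error rate is unchanged, which is the pattern the abstract describes.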
Pages: 17