Tree-based heterogeneous cascade ensemble model for credit scoring

Cited by: 12
Authors
Liu, Wanan [1]
Fan, Hong [1]
Xia, Meng [2]
Affiliations
[1] Donghua Univ, Glorious Sun Sch Business & Management, Shanghai 200051, Peoples R China
[2] Donghua Univ, Coll Informat Sci & Technol, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai
Keywords
Credit scoring; Ensemble algorithm; Heterogeneous deep forest; Weighted voting mechanism; Interpretability; ART CLASSIFICATION ALGORITHMS; BANKRUPTCY PREDICTION; FEATURE-SELECTION; IMPACT; PERFORMANCE; MACHINES;
DOI
10.1016/j.ijforecast.2022.07.007
Chinese Library Classification
F [Economics];
Discipline classification code
02;
Abstract
Credit scoring is an important tool for banks and lending companies to guard against commercial risk, and it provides good conditions for building personal credit. Ensemble algorithms have shown appealing progress in improving credit scoring. In this study, to meet the challenge of large-scale credit scoring, we propose a heterogeneous deep forest model (Heter-DF), whose design considers base learner selection, encouragement of diversity among base learners, and ensemble strategies. Heter-DF is designed as a scalable cascading framework that can increase its complexity with the scale of the credit dataset. Moreover, each level of Heter-DF is built from multiple heterogeneous tree-based ensemble base learners, avoiding homogeneous predictions within the ensemble framework. In addition, a weighted voting mechanism is introduced to highlight important information and suppress irrelevant features, making Heter-DF a robust model for credit scoring. Experimental results on four credit scoring datasets and six evaluation metrics show that the cascading framework is a good choice for the ensemble of tree-based base learners. A comparison between homogeneous and heterogeneous ensembles further demonstrates the effectiveness of Heter-DF. Experiments on different training sets indicate that Heter-DF is a scalable framework that not only handles large-scale credit scoring but also suits settings where only small-scale credit data are available. Finally, based on the good interpretability of a tree-based structure, the global interpretation of Heter-DF is preliminarily explored. © 2022 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
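The two mechanisms named in the abstract, weighted voting over heterogeneous learners and a cascade that passes information from one level to the next, can be illustrated with a minimal sketch. The abstract does not say how the voting weights are computed, so the sketch assumes weights proportional to a per-learner validation score. The feature augmentation step follows the general deep forest convention of appending each learner's class vector to the original features, which may differ from the paper's exact design. The function names `weighted_vote` and `augment`, and all numbers, are hypothetical.

```python
from typing import List, Sequence


def weighted_vote(probas: Sequence[Sequence[float]],
                  scores: Sequence[float]) -> List[float]:
    """Combine per-learner class-probability vectors.

    Assumption: each learner's weight is its validation score divided by
    the sum of all scores. The source does not specify the weighting rule.
    """
    if not probas or len(probas) != len(scores):
        raise ValueError("need one score per learner")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    weights = [s / total for s in scores]
    n_classes = len(probas[0])
    return [sum(w * p[k] for w, p in zip(weights, probas))
            for k in range(n_classes)]


def augment(features: Sequence[float],
            probas: Sequence[Sequence[float]]) -> List[float]:
    """Cascade step: append every learner's class vector to the raw features."""
    out = list(features)
    for p in probas:
        out.extend(p)
    return out


# Made-up outputs of three heterogeneous tree ensembles for one applicant,
# each given as [P(good), P(bad)], with made-up validation scores.
level_probas = [[0.8, 0.2], [0.6, 0.4], [0.5, 0.5]]
val_scores = [0.5, 0.25, 0.25]

combined = weighted_vote(level_probas, val_scores)   # level prediction
next_input = augment([35.0, 1200.0], level_probas)   # input to next level
```

In this sketch the learner with the highest validation score contributes most to the combined probability, and the next cascade level sees the two original features plus six class-probability values.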
Pages: 1593-1614
Page count: 22