A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring

Cited by: 449
Authors
Xia, Yufei [1 ]
Liu, Chuanzhe [1 ]
Li, YuYing [2 ]
Liu, Nana [1 ]
Affiliations
[1] China Univ Min & Technol, Sch Management, Xuzhou 221116, Jiangsu, Peoples R China
[2] China Univ Min & Technol, Sch Foreign Studies, Xuzhou 221116, Jiangsu, Peoples R China
Keywords
Credit scoring; Boosted decision tree; Bayesian hyper-parameter optimization; ART CLASSIFICATION ALGORITHMS; BANKRUPTCY PREDICTION; RISK-ASSESSMENT; ENSEMBLE; CLASSIFIERS; REGRESSION; MACHINE; MODELS; FOREST;
DOI
10.1016/j.eswa.2017.02.017
CLC number
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Credit scoring is an effective tool that helps banks make profitable lending decisions. Ensemble methods, which can be divided by structure into parallel and sequential ensembles, have recently been developed in the credit scoring domain and have proven superior at discriminating borrowers accurately. However, among ensemble models, little attention has been given to the following: (1) the hyper-parameter tuning of the base learner, despite being critical to a well-performing ensemble; (2) building sequential models (i.e., boosting), as most work has focused on developing the same or different algorithms in parallel; and (3) the comprehensibility of models. This paper proposes a sequential ensemble credit scoring model based on a variant of the gradient boosting machine, namely extreme gradient boosting (XGBoost). The model comprises three main steps. First, data pre-processing is employed to scale the data and handle missing values. Second, a model-based feature selection system based on relative feature importance scores is used to remove redundant variables. Third, the hyper-parameters of XGBoost are adaptively tuned with Bayesian hyper-parameter optimization, and the model is trained on the selected feature subset. Several hyper-parameter optimization methods and baseline classifiers are considered as reference points in the experiments. Results demonstrate that Bayesian hyper-parameter optimization performs better than random search, grid search, and manual search. Moreover, the proposed model outperforms the baseline models on average over four evaluation measures: accuracy, error rate, the area under the curve-H measure (AUC-H measure), and Brier score. The proposed model also provides feature importance scores and a decision chart, which enhance the interpretability of the credit scoring model. (C) 2017 Elsevier Ltd. All rights reserved.
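The three-step pipeline described in the abstract can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, a small Gaussian-process loop with an upper-confidence-bound acquisition stands in for a full Bayesian optimization library, the dataset is synthetic, and the search space and evaluation budget are illustrative assumptions.

```python
# Sketch of: (1) pre-processing, (2) importance-based feature selection,
# (3) Bayesian hyper-parameter optimization of a boosted tree model.
# All choices below (dataset, search grid, budget) are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for XGBoost
from sklearn.feature_selection import SelectFromModel
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           random_state=0)  # synthetic stand-in for credit data

# Step 1: pre-processing -- scale the inputs.
X = StandardScaler().fit_transform(X)

# Step 2: model-based feature selection via relative importance scores.
selector = SelectFromModel(GradientBoostingClassifier(random_state=0),
                           threshold="median")
X_sel = selector.fit_transform(X, y)

# Step 3: Bayesian hyper-parameter optimization with a GP surrogate.
def objective(params):
    """Cross-validated accuracy for one (learning_rate, max_depth) config."""
    lr, depth = params
    clf = GradientBoostingClassifier(learning_rate=lr, max_depth=int(depth),
                                     n_estimators=50, random_state=0)
    return cross_val_score(clf, X_sel, y, cv=3).mean()

# Candidate grid over (learning_rate, max_depth).
cands = np.array([(lr, d) for lr in np.linspace(0.01, 0.3, 15)
                  for d in range(2, 7)])
tried_idx = list(rng.choice(len(cands), size=5, replace=False))  # warm start
scores = [objective(cands[i]) for i in tried_idx]

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-3,
                              normalize_y=True)
for _ in range(10):                      # sequential optimization iterations
    gp.fit(cands[tried_idx], np.array(scores))
    mu, sd = gp.predict(cands, return_std=True)
    ucb = mu + sd                        # upper-confidence-bound acquisition
    ucb[tried_idx] = -np.inf             # do not re-evaluate tried configs
    nxt = int(np.argmax(ucb))
    tried_idx.append(nxt)
    scores.append(objective(cands[nxt]))

best_lr, best_depth = cands[tried_idx[int(np.argmax(scores))]]
print(f"selected features: {X_sel.shape[1]}, "
      f"best (lr, depth) = ({best_lr:.2f}, {int(best_depth)}), "
      f"CV accuracy = {max(scores):.3f}")
```

The sequential flavor of the method shows in the loop: each new hyper-parameter configuration is chosen from the surrogate model fitted to all previous evaluations, rather than drawn independently as in random or grid search.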
Pages: 225-241
Page count: 17