New hybrid data mining model for credit scoring based on feature selection algorithm and ensemble classifiers

Cited by: 38
Authors
Nalic, Jasmina [1]
Martinovic, Goran [1]
Zagar, Drago [1]
Affiliations
[1] JJ Strossmayer Univ Osijek, Fac Elect Engn Comp Sci & Informat Technol Osijek, Kneza Trpimira 2b, Osijek 31000, Croatia
Keywords
Credit scoring; Data mining; Ensemble classifier; Feature selection; Hybrid model
DOI
10.1016/j.aei.2020.101130
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new hybrid data mining model that combines several feature selection algorithms with ensemble learning classifiers to support the decision-making process. The model is built in several stages. In the first stage, the initial dataset is preprocessed; apart from applying different preprocessing techniques, we paid particular attention to feature selection. Five feature selection algorithms were applied, and their results, evaluated by the ROC and accuracy measures of a logistic regression algorithm, were combined using different voting types. We also propose a new voting method, called if-any, that outperformed all other voting methods as well as the results of every single feature selection algorithm. In the next stage, four classification algorithms (generalized linear model, support vector machine, naive Bayes, and decision tree) were trained on the dataset obtained from the feature selection process. These classifiers were combined into eight ensemble models using the soft voting method. Experimental results on a real dataset show that the hybrid model based on features selected by the if-any voting method and the GLM + DT ensemble achieves the highest performance and outperforms all other ensemble and single-classifier models.
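The two combination steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the if-any rule, so the code assumes it keeps a feature when at least one selector chose it; that reading, the function names `combine_feature_votes` and `soft_vote`, the alternative rules `majority` and `unanimous`, and the 0.5 cutoff are all illustrative assumptions and are not taken from the paper. Soft voting is shown in its common form, as an unweighted average of the positive-class probabilities produced by each classifier.

```python
from typing import Dict, List, Sequence, Set


def combine_feature_votes(selections: Sequence[Set[str]],
                          rule: str = "if_any") -> Set[str]:
    """Combine the feature subsets chosen by several selection algorithms."""
    if not selections:
        return set()

    # Count how many selectors voted for each feature.
    counts: Dict[str, int] = {}
    for chosen in selections:
        for feature in chosen:
            counts[feature] = counts.get(feature, 0) + 1

    n_selectors = len(selections)
    if rule == "if_any":
        # ASSUMPTION: "if-any" keeps a feature chosen by at least one selector.
        threshold = 1
    elif rule == "majority":
        threshold = n_selectors // 2 + 1
    elif rule == "unanimous":
        threshold = n_selectors
    else:
        raise ValueError(f"unknown voting rule: {rule}")

    return {f for f, votes in counts.items() if votes >= threshold}


def soft_vote(probabilities: Sequence[Sequence[float]],
              cutoff: float = 0.5) -> List[int]:
    """Average positive-class probabilities across classifiers, then threshold.

    probabilities[m][i] is classifier m's probability that sample i is positive.
    """
    n_models = len(probabilities)
    averaged = [sum(per_sample) / n_models for per_sample in zip(*probabilities)]
    return [1 if p >= cutoff else 0 for p in averaged]


if __name__ == "__main__":
    # Hypothetical feature names and probabilities, for illustration only.
    votes = [{"age", "income"}, {"income"}, {"income", "debt"}]
    print(sorted(combine_feature_votes(votes, "if_any")))
    print(soft_vote([[0.9, 0.2, 0.6], [0.7, 0.4, 0.3]]))
```

Under this reading, if-any is the most permissive rule: it never discards a feature that some selector found useful, and it leaves the final classifiers to cope with any redundancy.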
Pages: 9