Impact of Feature Selection Methods on the Performance of Credit Risk Classification Algorithms

Authors
Singh, N. P. [1]
Singh, Devender [2]
Affiliations
[1] Management Dev Inst, Informat Management Area, Gurgaon, Gurugram, India
[2] AIMA AMU Phd Program, Aligarh, Uttar Pradesh, India
Keywords
Feature Selection; Chi-Square; Gain Ratio; Information Gain; Relief F; Symmetric Uncertainty
DOI
10.1109/aict47866.2019.8981771
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents an ensemble of filter feature selection algorithms for a classification problem: assessing credit risk for a financial institution. Feature selection is one of the most important aspects of data mining and machine learning. Its main objectives are to reduce computing resources, reduce future data collection costs, reduce model complexity, avoid overfitting, and improve the performance of machine learning algorithms. In this paper, the set of available variables is first reduced using filter feature selection methods: chi-square, gain ratio, information gain, Relief F, and symmetric uncertainty. In addition, an ensemble feature selection of the input variables based on these individual methods is used. The impact of feature selection is measured by fitting seven classification algorithms: Random Forest, C4.5, PART, C5.0, Bagging, Boosting, and MINI Linear. The performance of the models is compared using accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC. The data used are the German bank data, comprising 1000 records with 20 features and one target variable.
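As an illustration of the filter-then-ensemble idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It computes three of the entropy-based filter scores named above (information gain, gain ratio, symmetric uncertainty) for categorical features and combines them by mean rank. The abstract does not say how the ensemble is formed, so the mean-rank aggregation is an assumption, as are the function names, the toy column names, and the toy data. Chi-square and Relief F are omitted for brevity.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def filter_scores(feature, target):
    """Entropy-based filter scores of one categorical feature against the target."""
    h_x, h_y = entropy(feature), entropy(target)
    h_xy = entropy(list(zip(feature, target)))  # joint entropy H(X, Y)
    info_gain = h_x + h_y - h_xy                # mutual information I(X; Y)
    gain_ratio = info_gain / h_x if h_x > 0 else 0.0
    sym_unc = 2 * info_gain / (h_x + h_y) if h_x + h_y > 0 else 0.0
    return {
        "information_gain": info_gain,
        "gain_ratio": gain_ratio,
        "symmetric_uncertainty": sym_unc,
    }


def ensemble_ranking(data, target):
    """Order features by their mean rank across the individual filter scores.

    Mean-rank aggregation is an assumed ensemble rule, chosen for simplicity.
    """
    scores = {name: filter_scores(col, target) for name, col in data.items()}
    methods = ["information_gain", "gain_ratio", "symmetric_uncertainty"]
    mean_rank = {name: 0.0 for name in data}
    for method in methods:
        ordered = sorted(data, key=lambda name: scores[name][method], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            mean_rank[name] += rank / len(methods)
    return sorted(data, key=lambda name: mean_rank[name])


# Hypothetical toy data: eight applicants, three categorical features.
target = ["good", "good", "bad", "bad", "good", "bad", "good", "bad"]
toy = {
    "checking_status": ["high", "high", "low", "low", "high", "low", "high", "low"],
    "noise": ["a", "b", "a", "b", "b", "a", "a", "b"],
    "constant": ["x"] * 8,
}

for name in ensemble_ranking(toy, target):
    print(name, filter_scores(toy[name], target))
```

In a workflow like the one the abstract outlines, the top-ranked features from such a ranking would then be passed to each classifier, and the resulting models compared on accuracy, sensitivity, specificity, and the other reported measures.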
Pages: 101-106
Page count: 6