A hybrid framework for optimal feature subset selection

Cited by: 27
Authors
Shukla, Alok Kumar [1]
Singh, Pradeep [1]
Vardhan, Manu [1]
Affiliation
[1] NIT Raipur, Dept Comp Sci & Engn, Chhattisgarh 492010, CG, India
Keywords
Data mining; genetic algorithm; conditional mutual information maximization; feature selection; MUTUAL INFORMATION; MICROARRAY DATA; GENE SELECTION; CANCER CLASSIFICATION; ALGORITHM; FILTER; OPTIMIZATION; PREDICTION; ENTROPY;
DOI
10.3233/JIFS-169936
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Hybrid feature selection approaches play a significant role in optimal subset selection and have been the subject of a substantial number of studies, driven by the growing needs of data mining applications. The feature subset selection (FSS) problem has two significant shortcomings that need to be addressed: first, finding a suitable filter method that is reasonably fast and can be computed efficiently for large volumes of data; and second, finding an efficient wrapper strategy that can discriminate the samples over the entire search space in a reasonable amount of time. After studying the shortcomings of individual feature selection methods (filter or wrapper), this paper investigates a new hybrid feature selection approach that combines a filter and a wrapper method to take advantage of both for classification problems. The proposed hybrid, called FWFSS, uses conditional mutual information maximization (CMIM) as the filter method and a genetic algorithm (GA) as the wrapper method to enhance overall classification performance and speed up the search for the essential features. To discard meaningless features and identify biomarkers, the GA wrapper uses the naive Bayes (NB) classifier as its fitness function. The proposed method is verified on datasets from the University of California, Irvine (UCI) repository and on microarray datasets. The experimental study shows that the approach outperforms conventional methods reported in the literature in terms of classification accuracy and the number of selected features.
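The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' FWFSS implementation: the function names (`mutual_info`, `cmim`, `nb_accuracy`, `ga_wrapper`), the toy data, the per-feature penalty of 0.01, the use of training-set accuracy as fitness, and the choice to seed the population with the full filter output are all assumptions made for illustration. Features are assumed to be discrete.

```python
import math
import random
from collections import Counter

def mutual_info(x, y, z=None):
    """I(X;Y|Z) in nats for discrete sequences; z=None gives plain I(X;Y)."""
    n = len(x)
    z = z if z is not None else [0] * n
    pxyz, pxz = Counter(zip(x, y, z)), Counter(zip(x, z))
    pyz, pz = Counter(zip(y, z)), Counter(z)
    return sum(c / n * math.log(c * pz[k] / (pxz[(i, k)] * pyz[(j, k)]))
               for (i, j, k), c in pxyz.items())

def cmim(columns, y, k):
    """Filter step: greedily pick k features maximizing min_s I(f; y | s)."""
    score = [mutual_info(col, y) for col in columns]
    selected = []
    for _ in range(k):
        rest = [f for f in range(len(columns)) if f not in selected]
        best = max(rest, key=lambda f: score[f])
        selected.append(best)
        for f in rest:
            if f != best:
                score[f] = min(score[f], mutual_info(columns[f], y, columns[best]))
    return selected

def nb_accuracy(columns, y, feats):
    """Training-set accuracy of a Laplace-smoothed categorical naive Bayes."""
    if not feats:
        return 0.0
    n, prior = len(y), Counter(y)
    cond = {f: Counter(zip(columns[f], y)) for f in feats}
    levels = {f: len(set(columns[f])) for f in feats}
    def predict(i):
        def log_post(c):
            return math.log(prior[c] / n) + sum(
                math.log((cond[f][(columns[f][i], c)] + 1) / (prior[c] + levels[f]))
                for f in feats)
        return max(prior, key=log_post)
    return sum(predict(i) == y[i] for i in range(n)) / n

def ga_wrapper(columns, y, candidates, generations=20, pop_size=12, p_mut=0.1, seed=0):
    """Wrapper step: binary GA over the filtered candidates, NB accuracy as fitness."""
    rng = random.Random(seed)
    def fitness(mask):
        feats = [f for f, bit in zip(candidates, mask) if bit]
        return nb_accuracy(columns, y, feats) - 0.01 * len(feats)  # assumed size penalty
    # Assumption: one individual starts as the full filter output.
    pop = [[1] * len(candidates)] + [[rng.randint(0, 1) for _ in candidates]
                                     for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates)) if len(candidates) > 1 else 0
            child = a[:cut] + b[cut:]           # one-point crossover
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = parents + children
    best = max(pop, key=fitness)
    return [f for f, bit in zip(candidates, best) if bit]

# Toy data: y = f0 AND f2; f1 is a redundant copy of f0; f3 is uninformative.
columns = [[0, 0, 0, 0, 1, 1, 1, 1],
           [0, 0, 0, 0, 1, 1, 1, 1],
           [0, 0, 1, 1, 0, 0, 1, 1],
           [0, 1, 0, 1, 0, 1, 0, 1]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
ranked = cmim(columns, y, k=2)         # filter: skips the redundant copy
final = ga_wrapper(columns, y, ranked) # wrapper: searches subsets of the filter output
print(ranked, final)
```

The filter narrows the search space cheaply, so the GA only evaluates the classifier on subsets of the surviving features. A held-out or cross-validated accuracy would normally replace the training-set accuracy used here.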
Pages: 2247-2259
Number of pages: 13
Related papers
50 in total
  • [1] A Hybrid Approach for Optimal Feature Subset Selection with Evolutionary Algorithms
    Kawamura, Atsushi
    Chakraborty, Basabi
    [J]. 2017 IEEE 8TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST), 2017, : 564 - 568
  • [2] Towards an optimal feature subset selection
    Shiba, OA
    Saeed, W
    Sulaiman, MN
    Ahmad, F
    Mamat, A
    [J]. SCORED 2003: STUDENT CONFERENCE ON RESEARCH AND DEVELOPMENT, PROCEEDINGS: NETWORKING THE FUTURE MIND IN CONVERGENCE TECHNOLOGY, 2003, : 376 - 380
  • [3] Optimal and Novel Hybrid Feature Selection Framework for Effective Data Classification
    Venkataraman, Sivakumar
    Selvaraj, Rajalakshmi
    [J]. ADVANCES IN SYSTEMS, CONTROL AND AUTOMATION, 2018, 442 : 499 - 514
  • [4] A general framework for boosting feature subset selection algorithms
    Perez-Rodriguez, Javier
    de Haro-Garcia, Aida
    Romero del Castillo, Juan A.
    Garcia-Pedrajas, Nicolas
    [J]. INFORMATION FUSION, 2018, 44 : 147 - 175
  • [5] A New Hybrid Feature Subset Selection Framework Based on Binary Genetic Algorithm and Information Theory
    Shukla, Alok Kumar
    Singh, Pradeep
    Vardhan, Manu
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2019, 18 (03)
  • [6] Optimal feature subset selection using hybrid binary Jaya optimization algorithm for text classification
    Thirumoorthy, K.
    Muneeswaran, K.
    [J]. SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2020, 45 (01):
  • [7] Optimal feature subset selection using hybrid binary Jaya optimization algorithm for text classification
    K Thirumoorthy
    K Muneeswaran
    [J]. Sādhanā, 2020, 45
  • [8] Hybrid Feature Selection Method Based on Feature Subset and Factor Analysis
    Gong, Lizeng
    Xie, Shanshan
    Zhang, Yan
    Wang, Mengyao
    Wang, Xiaoyan
    [J]. IEEE ACCESS, 2022, 10 : 120792 - 120803
  • [9] Improved Hybrid Feature Selection Framework
    Liao, Weizhi
    Ye, Guanglei
    Yan, Weijun
    Ma, Yaheng
    Zuo, Dongzhou
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (08): : 1266 - 1273
  • [10] Optimal Feature Selection using Fuzzy Combination of Feature Subset for Transcriptome Data
    Singh, Vikas
    Vardhan, Harsh
    Verma, Nishchal K.
    Cui, Yan
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2018,