Stopping rules for mutual information-based feature selection

Cited by: 18
Authors
Mielniczuk, Jan [1 ,2 ]
Teisseyre, Pawel [1 ]
Affiliations
[1] Polish Acad Sci, Inst Comp Sci, Jana Kazimierza 5, PL-01248 Warsaw, Poland
[2] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
Keywords
Entropy; Mutual information; Interaction information; Feature selection; Multiple hypothesis testing; Stopping rules; Regression; Framework
DOI
10.1016/j.neucom.2019.05.048
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In recent years, feature selection methods based on mutual information have attracted significant attention. Most of the proposed methods rely on sequential forward search, which at each step adds the feature that is most relevant in explaining the class variable when considered together with the already chosen features. Such procedures produce a ranking of features ordered according to their relevance. However, a significant limitation of all existing methods is the lack of stopping rules that separate the relevant features at the top of the ranking from the irrelevant ones. Finding an appropriate stopping rule is particularly important in domains where one wants to determine precisely the set of features affecting the class variable and discard the irrelevant ones (e.g., in genome-wide association studies the goal is to pinpoint the DNA mutations affecting a disease). In this work we propose stopping rules based on the distribution of an approximation of conditional mutual information, given that all relevant features have already been selected. We show that this distribution is approximately chi-squared, with an appropriate number of degrees of freedom, provided the features are discretized into a moderate number of bins. The proposed stopping rules are based on quantiles of this distribution and the related p-values, which are compared with thresholds used in multiple hypothesis testing. Importantly, the proposed methods do not require additional validation data and are independent of the classifier. Extensive simulation experiments indicate that the rules separate the relevant features from the irrelevant ones. We show experimentally that the Positive Selection Rate (the fraction of relevant features correctly selected, relative to all relevant features) approaches 1 as the sample size increases. At the same time, the False Discovery Rate (the fraction of irrelevant features selected, relative to all selected features) is controlled. Experiments on 17 benchmark datasets indicate that classification models built on the features selected by the proposed methods achieve, in 13 cases, significantly higher accuracy than models based on all available features. (C) 2019 Elsevier B.V. All rights reserved.
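To make the idea concrete, below is a minimal, hypothetical sketch of this kind of stopping rule, assuming discretized features and using plain empirical conditional mutual information (CMI) in place of the paper's approximation. It relies on the classical result that, under conditional independence, 2·n·Î(X; Y | Z) is asymptotically chi-squared with (|X|−1)(|Y|−1)|Z| degrees of freedom, and it substitutes a crude Bonferroni-style α/p threshold for the multiple-hypothesis-testing thresholds the abstract mentions. The function names (empirical_cmi, forward_select_with_stop) are illustrative, not from the authors' code.

```python
# Hypothetical sketch of a chi-squared stopping rule for forward
# CMI-based feature selection (illustrative; not the paper's code).
import numpy as np
from scipy.stats import chi2


def empirical_cmi(x, y, z):
    """Empirical conditional mutual information I(x; y | z) in nats.

    x, y : 1-D arrays of non-negative integer codes (feature, class).
    z    : 2-D array (n x k) of conditioning features; k may be 0.
    """
    n = len(x)
    if z.shape[1] == 0:
        z_codes = np.zeros(n, dtype=int)     # empty conditioning set
    else:
        _, z_codes = np.unique(z, axis=0, return_inverse=True)
    cmi = 0.0
    for c in np.unique(z_codes):
        mask = z_codes == c
        xs, ys = x[mask], y[mask]
        joint = np.zeros((xs.max() + 1, ys.max() + 1))
        np.add.at(joint, (xs, ys), 1.0)      # contingency table of this stratum
        joint /= mask.sum()                  # conditional joint P(x, y | z = c)
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        cmi += mask.mean() * np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz]))
    return cmi


def forward_select_with_stop(X, y, alpha=0.05):
    """Greedy forward search; stop when the best candidate's p-value,
    under the chi-squared null for 2*n*CMI, exceeds alpha / p (a crude
    Bonferroni-style stand-in for the paper's testing thresholds)."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    while remaining:
        z = X[:, selected]
        scores = [empirical_cmi(X[:, j], y, z) for j in remaining]
        best = int(np.argmax(scores))
        j, best_cmi = remaining[best], scores[best]
        # Degrees of freedom: (|X_j|-1) * (|Y|-1) * (number of z-cells).
        df = ((len(np.unique(X[:, j])) - 1)
              * (len(np.unique(y)) - 1)
              * (len(np.unique(z, axis=0)) if selected else 1))
        df = max(df, 1)                      # guard against constant columns
        if chi2.sf(2.0 * n * best_cmi, df) > alpha / p:
            break                            # stopping rule fires
        selected.append(j)
        remaining.remove(j)
    return selected
```

As a usage sketch: with X an n × p integer array of features binned into a handful of categories and y an integer class vector, forward_select_with_stop(X, y) returns the indices accepted before the rule fires; the abstract's Positive Selection Rate and False Discovery Rate measure how well this set matches the truly relevant features as n grows.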
Pages: 255 - 274
Number of pages: 20
Related Papers
(50 records in total)
  • [1] Mutual information-based feature selection for radiomics
    Oubel, Estanislao
    Beaumont, Hubert
    Iannessi, Antoine
    [J]. MEDICAL IMAGING 2016: PACS AND IMAGING INFORMATICS: NEXT GENERATION AND INNOVATIONS, 2016, 9789
  • [2] Mutual information-based feature selection for multilabel classification
    Doquire, Gauthier
    Verleysen, Michel
    [J]. NEUROCOMPUTING, 2013, 122 : 148 - 155
  • [3] A Study on Mutual Information-Based Feature Selection in Classifiers
    Arundhathi, B.
    Athira, A.
    Rajan, Ranjidha
    [J]. ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, ICAIECES 2016, 2017, 517 : 479 - 486
  • [4] CONDITIONAL DYNAMIC MUTUAL INFORMATION-BASED FEATURE SELECTION
    Liu, Huawen
    Mo, Yuchang
    Zhao, Jianmin
    [J]. COMPUTING AND INFORMATICS, 2012, 31 (06) : 1193 - 1216
  • [5] Early Stopping for Mutual Information Based Feature Selection
    Beinrucker, Andre
    Dogan, Ueruen
    Blanchard, Gilles
    [J]. 2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 975 - 978
  • [6] Feature redundancy term variation for mutual information-based feature selection
    Gao, Wanfu
    Hu, Liang
    Zhang, Ping
    [J]. APPLIED INTELLIGENCE, 2020, 50 (04) : 1272 - 1288
  • [7] Study on mutual information-based feature selection for text categorization
    Xu, Yan
    Jones, Gareth
    Li, Jintao
    Wang, Bin
    Sun, Chunming
    [J]. Journal of Computational Information Systems, 2007, 3 (03) : 1007 - 1012
  • [8] Mutual information-based feature selection for intrusion detection systems
    Amiri, Fatemeh
    Yousefi, MohammadMahdi Rezaei
    Lucas, Caro
    Shakery, Azadeh
    Yazdani, Nasser
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2011, 34 (04) : 1184 - 1199
  • [9] Mutual Information-Based Feature Selection and Ensemble Learning for Classification
    Qi, Chengming
    Zhou, Zhangbing
    Wang, Qun
    Hu, Lishuan
    [J]. 2016 INTERNATIONAL CONFERENCE ON IDENTIFICATION, INFORMATION AND KNOWLEDGE IN THE INTERNET OF THINGS (IIKI), 2016, : 116 - 121