Mutual information criterion for feature selection from incomplete data

Cited by: 60
Authors
Qian, Wenbin [1, 2]
Shu, Wenhao [3]
Affiliations
[1] Jiangxi Agr Univ, Sch Software, Nanchang 330045, Peoples R China
[2] Beijing Key Lab Knowledge Engn Mat Sci, Beijing 100083, Peoples R China
[3] East China Jiaotong Univ, Sch Informat Engn, Nanchang 330013, Peoples R China
Keywords
Feature selection; Uncertainty measure; Mutual information; Incomplete data; Rough sets; FEATURE SUBSET-SELECTION; ATTRIBUTE REDUCTION; MAX-DEPENDENCY; DISCRETIZATION; ALGORITHMS; RELEVANCE; SYSTEMS;
DOI
10.1016/j.neucom.2015.05.105
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection is an important preprocessing step in machine learning and data mining, and the feature evaluation criterion is a key issue in the construction of feature selection algorithms. Mutual information is one of the widely used criteria in feature selection; it measures the relevance between features and target classes. Mutual information-based feature selection algorithms have been extensively studied, but less effort has been made to investigate feature selection in incomplete data. In this paper, combined with the tolerance information granules in rough sets, a mutual information criterion is provided for evaluating candidate features in incomplete data, which not only favors the largest mutual information with the target class but also takes into consideration the redundancy between selected features. We first validate the feasibility of the mutual information criterion. Then an effective mutual information-based feature selection algorithm with a forward greedy strategy is developed for incomplete data. To further accelerate the feature selection process, the selection of candidate features is implemented in a dwindling object set. Compared with existing feature selection algorithms, the experimental results on different real data sets show that the proposed algorithm is more effective for feature selection in incomplete data in most cases. (C) 2015 Elsevier B.V. All rights reserved.
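The abstract gives no formulas, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes a tolerance relation in which a missing value (marked `None` here) matches any value, a granule-based entropy H(B) = -(1/|U|) Σ log2(|T_B(x)|/|U|) over tolerance classes, and a greedy score equal to relevance minus mean redundancy. All function names are hypothetical, and the dwindling-object-set acceleration is omitted.

```python
import math

MISSING = None  # assumed marker for a missing attribute value


def tolerant(u, v):
    """Two values are tolerant if they are equal or either one is missing."""
    return u is MISSING or v is MISSING or u == v


def tolerance_classes(rows, feats):
    """For each object i, the set of objects tolerant with it on all feats."""
    n = len(rows)
    return [
        {j for j in range(n)
         if all(tolerant(rows[i][f], rows[j][f]) for f in feats)}
        for i in range(n)
    ]


def granule_entropy(classes, n):
    """Assumed entropy over information granules (smaller granules, higher entropy)."""
    return -sum(math.log2(len(c) / n) for c in classes) / n


def mutual_information(classes_a, classes_b):
    """I(A; B) = H(A) + H(B) - H(A, B), with joint granules as intersections."""
    n = len(classes_a)
    joint = [a & b for a, b in zip(classes_a, classes_b)]
    return (granule_entropy(classes_a, n) + granule_entropy(classes_b, n)
            - granule_entropy(joint, n))


def greedy_select(rows, labels, k):
    """Forward greedy selection: maximize relevance minus mean redundancy."""
    n, m = len(rows), len(rows[0])
    # Labels are assumed complete, so their classes are ordinary equivalence classes.
    label_classes = [{j for j in range(n) if labels[j] == labels[i]}
                     for i in range(n)]
    feat_classes = [tolerance_classes(rows, [f]) for f in range(m)]
    relevance = [mutual_information(fc, label_classes) for fc in feat_classes]

    selected = []
    while len(selected) < min(k, m):
        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(
                mutual_information(feat_classes[f], feat_classes[s])
                for s in selected
            ) / len(selected)
            return relevance[f] - redundancy

        best = max((f for f in range(m) if f not in selected), key=score)
        selected.append(best)
    return selected
```

Calling `greedy_select(rows, labels, k)` on a list of equal-length tuples returns the indices of the chosen features in selection order. The pairwise tolerance-class computation is quadratic in the number of objects, which is the kind of cost the abstract's dwindling object set is meant to reduce.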
Pages: 210-220
Page count: 11
Related papers
50 in total
  • [1] Active Feature Selection for the Mutual Information Criterion
    Schnapp, Shachar
    Sabato, Sivan
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9497 - 9504
  • [2] Bayesian treatment of incomplete discrete data applied to mutual information and feature selection
    Hutter, M
    Zaffalon, M
    [J]. KI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2003, 2821 : 396 - 406
  • [3] Hybrid Feature Selection: Combining Fisher Criterion and Mutual Information for Efficient Feature Selection
    Dhir, Chandra Shekhar
    Lee, Soo Young
    [J]. ADVANCES IN NEURO-INFORMATION PROCESSING, PT I, 2009, 5506 : 613 - 620
  • [4] On the Feature Selection Criterion Based on an Approximation of Multidimensional Mutual Information
    Balagani, Kiran S.
    Phoha, Vir V.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32 (07) : 1342 - 1343
  • [5] Mutual information criterion for feature selection with application to classification of breast microcalcifications
    Diamant, Idit
    Shalhon, Moran
    Goldberger, Jacob
    Greenspan, Hayit
    [J]. MEDICAL IMAGING 2016: IMAGE PROCESSING, 2016, 9784
  • [6] An Information Criterion for Auxiliary Variable Selection in Incomplete Data Analysis
    Imori, Shinpei
    Shimodaira, Hidetoshi
    [J]. ENTROPY, 2019, 21 (03):
  • [7] An Akaike information criterion for model selection in the presence of incomplete data
    Cavanaugh, JE
    Shumway, RH
    [J]. JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 1998, 67 (01) : 45 - 65
  • [8] A New Approach for Feature Selection from Microarray Data Based on Mutual Information
    Tang, Jian
    Zhou, Shuigeng
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2016, 13 (06) : 1004 - 1015
  • [9] Application of the mutual information criterion for feature selection in computer-aided diagnosis
    Tourassi, GD
    Frederick, ED
    Markey, MK
    Floyd, CE
    [J]. MEDICAL PHYSICS, 2001, 28 (12) : 2394 - 2402
  • [10] Feature selection with missing data using mutual information estimators
    Doquire, Gauthier
    Verleysen, Michel
    [J]. NEUROCOMPUTING, 2012, 90 : 3 - 11