Feature redundancy term variation for mutual information-based feature selection

Cited by: 39
Authors
Gao, Wanfu [1 ,2 ,3 ]
Hu, Liang [1 ,2 ]
Zhang, Ping [1 ,2 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun, Peoples R China
[3] Jilin Univ, Coll Chem, Changchun, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Machine learning; Feature selection; Information theory; Feature redundancy; RELEVANCE;
DOI
10.1007/s10489-019-01597-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection plays a critical role in many applications related to machine learning, image processing, and gene expression analysis. Traditional feature selection methods aim to maximize feature dependency while minimizing feature redundancy. In previous information-theoretic feature selection methods, the feature redundancy term is measured by the mutual information between a candidate feature and each already-selected feature, or by the interaction information among a candidate feature, each already-selected feature, and the class. However, a larger value of the traditional feature redundancy term does not necessarily mean that a candidate feature is worse, because a candidate feature can carry a large amount of redundant information while also offering a large amount of new classification information. To address this issue, we design a new feature redundancy term that considers the relevancy between a candidate feature and the class given each already-selected feature, and we propose a novel feature selection method named min-redundancy and max-dependency (MRMD). To verify the effectiveness of our method, MRMD is compared with eight competitive methods on an artificial example and on fifteen real-world data sets. The experimental results show that our method achieves the best classification performance with respect to multiple evaluation criteria.
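The quantities named in the abstract can be illustrated with a small sketch for discrete data. It computes the mutual information I(X;Y) and the conditional mutual information I(X;Y|Z) from empirical counts, then runs a greedy forward selection. The scoring rule (relevance minus the average of I(X_k;X_j) - I(X_k;Y|X_j) over already-selected features X_j) is an assumption chosen only to show how a conditional-relevance term can offset a redundancy term; it is not taken from the paper and should not be read as the exact MRMD criterion. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def greedy_select(features, label, k):
    """Greedy forward selection with an illustrative (assumed) score.

    features: dict mapping a feature name to a list of discrete values.
    label:    list of class values, same length as each feature column.
    """
    selected = []
    remaining = list(features)

    def score(name):
        relevance = mutual_info(features[name], label)
        if not selected:
            return relevance
        # Redundancy with each selected feature, offset by the relevance
        # the candidate still has once that selected feature is known.
        penalty = sum(
            mutual_info(features[name], features[s])
            - cond_mutual_info(features[name], label, features[s])
            for s in selected
        ) / len(selected)
        return relevance - penalty

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the class is determined jointly by a and b; a_dup copies a.
toy = {
    "a": [0, 0, 1, 1],
    "a_dup": [0, 0, 1, 1],
    "b": [0, 1, 0, 1],
}
y = [0, 1, 2, 3]
print(greedy_select(toy, y, 2))  # prints ['a', 'b']
```

In the toy data, `a_dup` and `b` are equally relevant to the class on their own, but only `b` adds new class information once `a` is selected, so the conditional term favors `b` over the duplicate.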
Pages: 1272 - 1288
Number of pages: 17
Related papers
50 in total
  • [21] Feature selection based on mutual information and redundancy-synergy coefficient
    杨胜
    顾钧
    [J]. Journal of Zhejiang University-Science A(Applied Physics & Engineering), 2004, (11) : 71 - 80
  • [22] A Mutual Information-Based Hybrid Feature Selection Method for Software Cost Estimation Using Feature Clustering
    Shi, Shihai
    Liu, Qin
    [J]. INTERNATIONAL JOINT CONFERENCE ON APPLIED MATHEMATICS, STATISTICS AND PUBLIC ADMINISTRATION (AMSPA 2014), 2014, : 481 - 490
  • [23] Mutual Information-based multi-label feature selection using interaction information
    Lee, Jaesung
    Kim, Dae-Won
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (04) : 2013 - 2025
  • [24] Mutual information-based filter hybrid feature selection method for medical datasets using feature clustering
    Sadegh Asghari
    Hossein Nematzadeh
    Ebrahim Akbari
    Homayun Motameni
    [J]. Multimedia Tools and Applications, 2023, 82 : 42617 - 42639
  • [25] A Mutual Information-Based Hybrid Feature Selection Method for Software Cost Estimation Using Feature Clustering
    Liu, Qin
    Shi, Shihai
    Zhu, Hongming
    Xiao, Jiakai
    [J]. 2014 IEEE 38TH ANNUAL INTERNATIONAL COMPUTERS, SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), 2014, : 27 - 32
  • [26] Mutual information-based filter hybrid feature selection method for medical datasets using feature clustering
    Asghari, Sadegh
    Nematzadeh, Hossein
    Akbari, Ebrahim
    Motameni, Homayun
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (27) : 42617 - 42639
  • [27] Application of mutual information-based sequential feature selection to ISBSG mixed data
    Marta Fernández-Diego
    Fernando González-Ladrón-de-Guevara
    [J]. Software Quality Journal, 2018, 26 : 1299 - 1325
  • [28] Dynamic mutual information-based feature selection for multi-label learning
    Kim, Kyung-Jun
    Jun, Chi-Hyuck
    [J]. INTELLIGENT DATA ANALYSIS, 2023, 27 (04) : 891 - 909
  • [29] Mutual Information-based Feature Selection from Set-valued Data
    Shu, Wenhao
    Qian, Wenbin
    [J]. 2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014, : 733 - 739
  • [30] Application of mutual information-based sequential feature selection to ISBSG mixed data
    Fernandez-Diego, Marta
    Gonzalez-Ladron-de-Guevara, Fernando
    [J]. SOFTWARE QUALITY JOURNAL, 2018, 26 (04) : 1299 - 1325