A Supervised Filter Feature Selection method for mixed data based on Spectral Feature Selection and Information-theory redundancy analysis

Cited by: 24
Authors
Solorio-Fernandez, Saul [1]
Fco Martinez-Trinidad, Jose [1]
Ariel Carrasco-Ochoa, J. [1]
Affiliations
[1] Inst Nacl Astrofis Opt & Electr, Comp Sci Dept, Luis Enrique Erro 1, Puebla 72840, Mexico
Keywords
Supervised feature selection; Mixed data; Filter feature subset selection; Redundancy analysis; EFFICIENT FEATURE-SELECTION; MUTUAL INFORMATION; ALGORITHM; RELEVANCE;
DOI
10.1016/j.patrec.2020.07.039
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Spectral analysis and information theory are two powerful and successful frameworks for feature selection in supervised classification problems. However, most of the methods developed under these frameworks handle exclusively numerical or exclusively non-numerical data. In this paper, we propose a supervised filter feature selection method that combines Spectral Feature Selection and information-theory-based redundancy analysis for selecting relevant and non-redundant features in supervised mixed datasets, i.e., datasets where the objects are described simultaneously by both numerical and non-numerical features. To demonstrate the effectiveness of the proposed method, we conducted several experiments on 40 public real-world datasets. Additionally, we compare our method against relevant state-of-the-art supervised filter methods for numerical, non-numerical, and mixed data. In this comparison, our method generally obtains better results than the other evaluated filter feature selection methods. (C) 2020 Elsevier B.V. All rights reserved.
Pages: 321-328
Number of pages: 8
Related papers
50 in total
  • [21] Feature selection based on information theory filters
    Duch, W
    Biesiada, J
    Winiarski, T
    Grudzinski, K
    Grabczewski, K
    [J]. NEURAL NETWORKS AND SOFT COMPUTING, 2003, : 173 - 178
  • [22] A Feature Selection Framework Based on Supervised Data Clustering
    Liu, Hongzhi
    Fu, Bin
    Jiang, Zhengshen
    Wu, Zhonghai
    Hsu, D. Frank
    [J]. 2016 IEEE 15TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2016, : 316 - 321
  • [23] Semi-supervised Feature Selection via Spectral Analysis
    Zhao, Zheng
    Liu, Huan
    [J]. PROCEEDINGS OF THE SEVENTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 641 - 646
  • [24] Information Theory-Based Feature Selection: Minimum Distribution Similarity with Removed Redundancy
    Zhang, Yu
    Lin, Zhuoyi
    Kwoh, Chee Keong
    [J]. COMPUTATIONAL SCIENCE - ICCS 2020, PT V, 2020, 12141 : 3 - 17
  • [25] A Clustering Based Feature Selection Method Using Feature Information Distance for Text Data
    Chao, Shilong
    Cai, Jie
    Yang, Sheng
    Wang, Shulin
    [J]. INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2016, PT I, 2016, 9771 : 122 - 132
  • [26] Multi-label feature selection based on minimizing feature redundancy of mutual information
    Zhou, Gaozhi
    Li, Runxin
    Shang, Zhenhong
    Li, Xiaowu
    Jia, Lianyin
    [J]. NEUROCOMPUTING, 2024, 607
  • [27] A filter approach to feature selection based on mutual information
    Huang, Jinjie
    Cai, Yunze
    Xu, Xiaoming
    [J]. PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006, : 84 - 89
  • [28] A Filter Feature Selection Method Based on MFA Score and Redundancy Excluding and It’s Application to Tumor Gene Expression Data Analysis
    Jiangeng Li
    Lei Su
    Zenan Pang
    [J]. Interdisciplinary Sciences: Computational Life Sciences, 2015, 7 : 391 - 396
  • [29] A Filter Feature Selection Method Based on MFA Score and Redundancy Excluding and It's Application to Tumor Gene Expression Data Analysis
    Li, Jiangeng
    Su, Lei
    Pang, Zenan
    [J]. INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2015, 7 (04) : 391 - 396
  • [30] Application of information theory to feature selection in hematology data
    Lu, Jiuliu
    Ramirez, Carlos
    [J]. INTERNATIONAL JOURNAL OF LABORATORY HEMATOLOGY, 2007, 29 : 56 - 56