High-dimensional supervised feature selection via optimized kernel mutual information

Cited by: 16
Authors
Bi, Ning [1 ]
Tan, Jun [1 ]
Lai, Jian-Huang [2 ]
Suen, Ching Y. [3 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Math, Guangzhou 510275, Guangdong, Peoples R China
[2] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510275, Guangdong, Peoples R China
[3] Concordia Univ, Ctr Pattern Recognit & Machine Intelligence, Montreal, PQ H3G 1M8, Canada
Funding
National Natural Science Foundation of China;
Keywords
Feature selection; Kernel method; Mutual information; Classification; Optimize function; Machine learning; UNSUPERVISED FEATURE-SELECTION; SEARCH;
DOI
10.1016/j.eswa.2018.04.037
CLC Classification Number
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Feature selection is very important in pattern recognition for reducing the dimensionality of data and improving the efficiency of learning algorithms. Recent research on new approaches has focused mostly on improving accuracy and reducing computing time. This paper presents a flexible feature-selection method based on an optimized kernel mutual information (OKMI) approach. Mutual information (MI) has been applied successfully in decision trees to rank variables; its aim is to connect class labels with the distribution of the experimental data. The use of MI removes irrelevant features and decreases redundant features. However, MI is usually less robust when the data distribution is not centralized. To overcome this problem, we propose the OKMI approach, which combines MI with a kernel function. This approach may be used for feature selection with nonlinear models by defining kernels for feature vectors and class-label vectors. By optimizing the objective functions, we develop a new feature-selection algorithm that combines both MI and kernel learning, and we discuss the relationship among various kernel-selection methods. Experiments were conducted to compare the new technique with other methods on various data sets, and in each case the OKMI approach performs better in terms of feature-classification accuracy and computing time. The OKMI method avoids the computational complexity of estimating probability distributions and finds the optimal features at very low computational cost. As a result, the OKMI method with the proposed algorithm is effective and robust over a wide range of real applications in expert systems. Crown Copyright (C) 2018 Published by Elsevier Ltd. All rights reserved.
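To make the general idea concrete, below is a minimal illustrative sketch in Python (NumPy) of kernel-based dependence scoring for supervised feature selection. It uses the Hilbert-Schmidt Independence Criterion (HSIC) as a stand-in kernel dependence measure between each feature and the class labels; the Gaussian kernel, its bandwidth, the delta kernel on labels, and the top-k ranking are assumptions made for this example only and are not the paper's OKMI objective.

# Illustrative sketch (not the paper's OKMI method): score each feature by a
# kernel dependence criterion (HSIC) against the class labels, keep the top k.
import numpy as np

def rbf_kernel(x, sigma=1.0):
    # Gaussian (RBF) kernel matrix for a single feature column x of shape (n,).
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def label_kernel(y):
    # Delta kernel on class labels: L[i, j] = 1 if y_i == y_j else 0.
    return (y[:, None] == y[None, :]).astype(float)

def hsic(K, L):
    # Biased HSIC estimator: tr(K H L H) / (n - 1)^2 with H = I - (1/n) * 1 1^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def select_features(X, y, k=10, sigma=1.0):
    # Rank features by HSIC with the labels and return the indices of the top k.
    L = label_kernel(y)
    scores = np.array([hsic(rbf_kernel(X[:, j], sigma), L) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k], scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)
    X = rng.normal(size=(200, 20))
    X[:, 3] += 2.0 * y          # make feature 3 informative about the class
    top, scores = select_features(X, y, k=5)
    print("top features:", top)  # feature 3 should rank first

Scoring features one at a time keeps each kernel matrix small; a joint objective over feature subsets, as the abstract describes for OKMI, would instead optimize a kernel defined on the selected feature vector as a whole.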
Pages: 81 - 95
Number of pages: 15
Related Papers
50 records
  • [1] Feature selection, mutual information, and the classification of high-dimensional patterns
    Bonev, Boyan
    Escolano, Francisco
    Cazorla, Miguel
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2008, 11 (3-4) : 309 - 319
  • [2] A Feature Subset Selection Method Based On High-Dimensional Mutual Information
    Zheng, Yun
    Kwoh, Chee Keong
    [J]. ENTROPY, 2011, 13 (04) : 860 - 901
  • [3] Feature Selection using Mutual Information for High-dimensional Data Sets
    Nagpal, Arpita
    Gaur, Deepti
    Gaur, Seema
    [J]. SOUVENIR OF THE 2014 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2014, : 45 - 49
  • [4] On feature selection for supervised learning problems involving high-dimensional analytical information
    Zuvela, P.
    Liu, J. Jay
    [J]. RSC ADVANCES, 2016, 6 (86) : 82801 - 82809
  • [5] Semi-supervised Feature Selection by Mutual Information Based on Kernel Density Estimation
    Xu, Siqi
    Dai, Jianhua
    Shi, Hong
    [J]. 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 818 - 823
  • [6] Clustering high-dimensional data via feature selection
    Liu, Tianqi
    Lu, Yu
    Zhu, Biqing
    Zhao, Hongyu
    [J]. BIOMETRICS, 2023, 79 (02) : 940 - 950
  • [7] Extremely High-Dimensional Feature Selection via Feature Generating Samplings
    Li, Shutao
    Wei, Dan
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2014, 44 (06) : 737 - 747
  • [8] Feature selection, mutual information, and the classification of high-dimensional patterns: Applications to image classification and microarray data analysis
    Boyan Bonev
    Francisco Escolano
    Miguel Cazorla
    [J]. Pattern Analysis and Applications, 2008, 11 : 309 - 319
  • [9] Band Selection for High-Dimensional Remote Sensing Data by Mutual Information
    Banit'ouagua, Ibtissam
    Kerroum, Mounir Ait
    Hammouch, Ahmed
    Aboutajdine, Driss
    [J]. 2016 INTERNATIONAL CONFERENCE ON ELECTRICAL AND INFORMATION TECHNOLOGIES (ICEIT), 2016, : 386 - 391
  • [10] Feature selection for high-dimensional data
    Bolón-Canedo V.
    Sánchez-Maroño N.
    Alonso-Betanzos A.
    [J]. Progress in Artificial Intelligence, 2016, 5 (2) : 65 - 75