Feature Selection for High-Dimensional Data Through Instance Vote Combining

Cited by: 0
Authors
Chamakura, Lily [1]
Saha, Goutam [1]
Affiliations
[1] Indian Inst Technol Kharagpur, Kharagpur, W Bengal, India
Keywords
Feature selection; Filter-based method; Set-covering problem; Instance voting; Graph modularity; Vote combining; Classification; Prediction; Discovery; Cancer
DOI
10.1145/3371158.3371177
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Supervised feature selection (FS) is used to select a discriminative and non-redundant subset of features in classification problems dealing with high-dimensional inputs. In this paper, feature selection is posed as a problem akin to set covering, where the goal is to select a subset of features such that they cover the instances. To solve this formulation, we quantify the local relevance of each feature (i.e., the votes assigned by instances), which captures the extent to which a given feature is useful for classifying the individual instances correctly. We propose to combine the instance votes across features to infer their joint local relevance. The votes are combined on the basis of geometric principles underlying classification and feature spaces. Further, we show how such instance vote combining may be employed to derive a heuristic search strategy for selecting a relevant and non-redundant subset of features. We illustrate the effectiveness of our approach by evaluating classification performance and robustness to data variations on publicly available benchmark datasets. We observed that the proposed method outperforms state-of-the-art mutual-information-based FS techniques and performs comparably to other heuristic approaches that solve the set-covering formulation of feature selection.
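As a rough illustration of the set-covering view described in the abstract, the sketch below runs the textbook greedy set-cover heuristic over features. It is not the authors' vote-combining method: the name `greedy_cover`, the `votes` structure (one set of covered instance indices per feature), and the toy data are all assumptions made for this example, and the paper's geometric vote combining is not reproduced here.

```python
from typing import List, Set


def greedy_cover(votes: List[Set[int]], n_instances: int) -> List[int]:
    """Generic greedy set cover over features (illustrative only).

    votes[j] is a hypothetical set of instance indices that feature j
    "covers", i.e. instances that vote for feature j as locally relevant.
    """
    uncovered = set(range(n_instances))
    selected: List[int] = []
    while uncovered and votes:
        # Pick the feature that covers the most still-uncovered instances.
        best = max(range(len(votes)), key=lambda j: len(votes[j] & uncovered))
        gain = votes[best] & uncovered
        if not gain:
            # The remaining instances are covered by no feature; stop.
            break
        selected.append(best)
        uncovered -= gain
    return selected


# Toy example: 4 features, 6 instances (made-up data).
votes = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]
selected = greedy_cover(votes, 6)
print(selected)
```

In this toy setting a feature that only re-covers already-covered instances adds no gain and is skipped, which is one simple way a covering formulation discourages redundant features.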
Pages: 161-169
Number of pages: 9