Efficient Feature Selection in the Presence of Multiple Feature Classes

Cited by: 3
Authors
Dhillon, Paramveer S. [1 ]
Foster, Dean [2 ]
Ungar, Lyle H. [1 ]
Affiliations
[1] Univ Penn, CIS, Philadelphia, PA 19104 USA
[2] Univ Penn, Stat, Philadelphia, PA 19104 USA
DOI
10.1109/ICDM.2008.56
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an information theoretic approach to feature selection when the data possesses feature classes. Feature classes are pervasive in real data. For example, in gene expression data, the genes that serve as features may be divided into classes based on their membership in gene families or pathways. When doing word sense disambiguation or named entity extraction, features fall into classes including adjacent words, their parts of speech, and the topic and venue of the document the word is in. When predictive features occur predominantly in a small number of feature classes, our information theoretic approach significantly improves feature selection. Experiments on real and synthetic data demonstrate substantial improvement in predictive accuracy over the standard L0 penalty-based stepwise and streamwise feature selection methods as well as over Lasso and Elastic Nets, all of which are oblivious to the existence of feature classes.
Pages: 779 / +
Page count: 2