Efficient Feature Selection in the Presence of Multiple Feature Classes

Cited by: 3
Authors
Dhillon, Paramveer S. [1 ]
Foster, Dean [2 ]
Ungar, Lyle H. [1 ]
Affiliations
[1] Univ Penn, CIS, Philadelphia, PA 19104 USA
[2] Univ Penn, Stat, Philadelphia, PA 19104 USA
DOI
10.1109/ICDM.2008.56
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an information-theoretic approach to feature selection when the data possess feature classes. Feature classes are pervasive in real data. For example, in gene expression data, the genes that serve as features may be divided into classes based on their membership in gene families or pathways. In word sense disambiguation or named entity extraction, features fall into classes including adjacent words, their parts of speech, and the topic and venue of the document the word is in. When predictive features occur predominantly in a small number of feature classes, our information-theoretic approach significantly improves feature selection. Experiments on real and synthetic data demonstrate substantial improvement in predictive accuracy over the standard L0 penalty-based stepwise and streamwise feature selection methods as well as over Lasso and Elastic Nets, all of which are oblivious to the existence of feature classes.
Pages: 779 / +
Page count: 2