Efficient Feature Selection in the Presence of Multiple Feature Classes

Cited by: 3
Authors
Dhillon, Paramveer S. [1 ]
Foster, Dean [2 ]
Ungar, Lyle H. [1 ]
Affiliations
[1] Univ Penn, CIS, Philadelphia, PA 19104 USA
[2] Univ Penn, Stat, Philadelphia, PA 19104 USA
DOI
10.1109/ICDM.2008.56
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an information theoretic approach to feature selection when the data possesses feature classes. Feature classes are pervasive in real data. For example, in gene expression data, the genes which serve as features may be divided into classes based on their membership in gene families or pathways. When doing word sense disambiguation or named entity extraction, features fall into classes including adjacent words, their parts of speech, and the topic and venue of the document the word is in. When predictive features occur predominantly in a small number of feature classes, our information theoretic approach significantly improves feature selection. Experiments on real and synthetic data demonstrate substantial improvement in predictive accuracy over the standard L-0 penalty-based stepwise and streamwise feature selection methods as well as over Lasso and Elastic Nets, all of which are oblivious to the existence of feature classes.
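The abstract describes the idea only at a high level. The short Python sketch below illustrates one way a class-aware, information-theoretic penalty can steer greedy streamwise selection toward feature classes that have already proven useful: a feature is kept when its reduction in coding length outweighs its description cost, and that cost is discounted for classes that already contributed a selected feature. The cost constants, the Gaussian coding-length criterion, and the function name streamwise_select are illustrative assumptions for this sketch, not the coding scheme published in the paper.

import numpy as np

def streamwise_select(X, y, feature_class, base_cost=3.0, repeat_cost=1.0):
    # Greedy streamwise selection with a class-aware coding penalty (sketch).
    # A candidate feature is kept when the drop in residual coding length
    # (Gaussian log-likelihood, in nats) exceeds the cost of describing it.
    # Features from classes that already contributed a selected feature pay
    # the cheaper repeat_cost; these costs are illustrative assumptions.
    n, p = X.shape
    selected, used_classes = [], set()
    rss = float(((y - y.mean()) ** 2).sum())

    for j in range(p):                      # stream over candidates once
        A = np.column_stack([np.ones(n), X[:, selected + [j]]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        new_rss = float(((y - A @ coef) ** 2).sum())
        gain = 0.5 * n * np.log(max(rss, 1e-12) / max(new_rss, 1e-12))
        cost = repeat_cost if feature_class[j] in used_classes else base_cost
        cost += 0.5 * np.log(n)             # BIC-style cost per coefficient
        if gain > cost:
            selected.append(j)
            used_classes.add(feature_class[j])
            rss = new_rss
    return selected

# Toy check: the predictive signal is concentrated in one feature class.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))
classes = np.repeat(np.arange(6), 5)        # 6 classes of 5 features each
y = X[:, 0] + 0.8 * X[:, 1] + 0.6 * X[:, 2] + 0.1 * rng.standard_normal(200)
print(streamwise_select(X, y, classes))     # expect features from class 0

The class-aware discount is what distinguishes this from a plain BIC-penalized stepwise search: once a class has paid its way, further features from it become cheaper to admit, which matches the setting the abstract targets (predictive features concentrated in a few classes).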
Pages: 779+
Number of pages: 2
Related papers (50 records)
  • [1] Yang, Shuang-Hong; Hu, Bao-Gang. Efficient feature selection in the presence of outliers and noises. INFORMATION RETRIEVAL TECHNOLOGY, 2008, 4993: 184-191.
  • [2] Sanchez-Marono, Noelia; Alonso-Betanzos, Amparo; Calvo-Estevez, Rosa M. A Wrapper Method for Feature Selection in Multiple Classes Datasets. BIO-INSPIRED SYSTEMS: COMPUTATIONAL AND AMBIENT INTELLIGENCE, PT 1, 2009, 5517: 456-463.
  • [3] Sechidis, Konstantinos; Papangelou, Konstantinos; Nogueira, Sarah; Weatherall, James; Brown, Gavin. On the Stability of Feature Selection in the Presence of Feature Correlations. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT I, 2020, 11906: 327-342.
  • [4] Shafti, Leila S.; Perez, Eduardo. Feature Construction and Feature Selection in Presence of Attribute Interactions. HYBRID ARTIFICIAL INTELLIGENCE SYSTEMS, 2009, 5572: 589-596.
  • [5] Liu, H; Yu, L; Dash, M; Motoda, H. Active feature selection using classes. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, 2003, 2637: 474-485.
  • [6] Yan, Xuyang; Nazmi, Shabnam; Erol, Berat A.; Homaifar, Abdollah; Gebru, Biniam; Tunstel, Edward. An efficient unsupervised feature selection procedure through feature clustering. PATTERN RECOGNITION LETTERS, 2020, 131: 277-284.
  • [7] Jeon, Hyelynn; Oh, Sejong. Hybrid-Recursive Feature Elimination for Efficient Feature Selection. APPLIED SCIENCES-BASEL, 2020, 10 (09).
  • [8] Chang, Fu; Chen, Jen-Cheng. An Adaptive Multiple Feature Subset Method for Feature Ranking and Selection. INTERNATIONAL CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI 2010), 2010: 255-262.
  • [9] Yang, Shuai; Guo, Xianjie; Yu, Kui; Huang, Xiaoling; Jiang, Tingting; He, Jin; Gu, Lichuan. Causal Feature Selection in the Presence of Sample Selection Bias. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (05).
  • [10] Dhir, Chandra Shekhar; Lee, Soo Young. Hybrid Feature Selection: Combining Fisher Criterion and Mutual Information for Efficient Feature Selection. ADVANCES IN NEURO-INFORMATION PROCESSING, PT I, 2009, 5506: 613-620.