Feature salience definition and estimation and its use in feature subset selection

Cited by: 4
Authors
Richards, G. [1]
Brazier, K. [1]
Wang, W. [1]
Affiliation
[1] Univ E Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
Keywords
data mining; feature subset selection; feature salience; decision tree
DOI
10.3233/IDA-2006-10102
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we describe novel feature subset selection methods based on the estimation of feature salience, i.e. the quantification of the relative importance of individual features, in the presence of other features, for determining the classes of records in a dataset. We define feature salience and present a method for estimating it. Five synthetic datasets were used to demonstrate the utility of the salience estimation technique; in most cases the estimates were good approximations to the calculated saliencies. We then describe three feature subset selection methods that use feature salience as their basis. These methods were evaluated on real-world datasets by constructing classifiers using all features and comparing them with classifiers constructed using only a selected subset of features. The results compared well with other state-of-the-art techniques, and the methods were simpler to implement and significantly faster to execute. On average, applying our best feature subset selection method resulted in trees that used only 49% of the features used by trees constructed with the full set of features. This reduction in the number of features used was associated with a 1% improvement in classifier accuracy.
Pages: 3-21
Page count: 19