A Hybrid Feature Selection Method for Data Sets of Thousands of Variables

Cited by: 9
Authors
Liu, Jihong [1]
Wang, Guoxiong [2]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Peoples R China
[2] Liaoning Mil Reg, Polit Dept, Shenyang, Peoples R China
Keywords
feature selection; Shapley value; mutual information; relevance
DOI
10.1109/ICACC.2010.5486671
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Feature selection has become a research focus in applications whose datasets contain thousands of variables. In this study we present a hybrid feature selection (HFS) method that combines the filter and wrapper models of feature subset selection. In the first stage, a filter model ranks the features by the mutual information (MI) between each feature and each class and retains the k features most relevant to the classes. In the second stage, a wrapper-based feature selection algorithm uses the Shapley value to evaluate the contribution of each feature in a feature subset to the classification task. Experimental results show that the HFS method achieves better classification performance than feature selection based on the Shapley value alone or on MI alone.
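The two-stage idea in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the classifier, the MI estimator, or how the Shapley value is computed. The sketch assumes discrete features, estimates MI from empirical counts, uses a hypothetical value function (`subset_accuracy`, the training accuracy of a majority-vote lookup table), and approximates Shapley values by sampling random feature orderings. All function names and the toy data are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """MI (in nats) between two discrete sequences, from empirical counts."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def filter_top_k(X, y, k):
    """Stage 1 (filter): indices of the k features with the highest MI to y."""
    n_features = len(X[0])
    scores = [mutual_information([row[j] for row in X], y)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def subset_accuracy(X, y, subset):
    """Hypothetical value function: training accuracy of a majority-vote
    lookup table keyed on the values of the features in `subset`."""
    if not subset:
        return Counter(y).most_common(1)[0][1] / len(y)
    votes = {}
    for row, label in zip(X, y):
        key = tuple(row[j] for j in subset)
        votes.setdefault(key, Counter())[label] += 1
    return sum(c.most_common(1)[0][1] for c in votes.values()) / len(y)


def shapley_estimates(X, y, candidates, n_permutations=200, seed=0):
    """Stage 2 (wrapper): Monte Carlo Shapley estimates, averaging each
    feature's marginal gain in accuracy over random feature orderings."""
    rng = random.Random(seed)
    phi = {j: 0.0 for j in candidates}
    for _ in range(n_permutations):
        order = list(candidates)
        rng.shuffle(order)
        subset, prev = [], subset_accuracy(X, y, [])
        for j in order:
            subset.append(j)
            cur = subset_accuracy(X, y, subset)
            phi[j] += cur - prev
            prev = cur
    return {j: v / n_permutations for j, v in phi.items()}


# Toy data (assumed): f0 copies the label, f1 is a noisy copy,
# f2 and f3 are independent of the label.
y = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = [0, 0, 0, 1, 1, 1, 1, 0]
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
f3 = [0, 0, 1, 1, 0, 0, 1, 1]
X = list(zip(f0, f1, f2, f3))

selected = filter_top_k(X, y, k=2)        # stage 1: keep the top-k by MI
phi = shapley_estimates(X, y, selected)   # stage 2: score the survivors
ranking = sorted(selected, key=phi.get, reverse=True)
print(selected, phi, ranking)
```

The filter stage is cheap and prunes the candidate set before the wrapper stage, which is far more expensive because every sampled ordering re-evaluates the value function once per candidate feature. In a realistic setting the value function would more plausibly be cross-validated accuracy of an actual classifier.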
Pages: 288-291
Number of pages: 4