A feature selection algorithm based on Hoeffding inequality and mutual information

Cited by: 0
Authors
Yin, Chunyong [1]
Feng, Lu [1]
Ma, Luyu [1]
Yin, Zhichao [2]
Wang, Jin [1]
Affiliations
[1] School of Computer and Software, Jiangsu Key Laboratory of Meteorological Observation and Information Processing, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, China
[2] Nanjing No.1 Middle School, Nanjing, China
Keywords
Classification (of information); Data mining
DOI
10.14257/ijsip.2015.8.11.39
Abstract
With the rapid development of the Internet, data mining is being applied to Internet data more and more widely. However, complex and redundant features in the data sources make the data mining process inefficient and complicated, so feature selection research is essential to make data mining simpler and more efficient. In this paper, we propose a new measure of the degree of correlation among the internal features of a dataset, which is a variant of mutual information. We also introduce the Hoeffding inequality as a constraint in constructing the algorithm. In the experiments, we use the C4.5 classification algorithm as the test algorithm and compare HSF with BIF (a feature selection algorithm based on mutual information). Experimental results show that HSF performs better than BIF [1] in TP and FP rates; moreover, the feature subset obtained by HSF significantly improves the TP rate, FP rate, and memory usage of the C4.5 classification algorithm. © 2015 SERSC.
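The abstract names two ingredients, mutual information and the Hoeffding inequality, without giving formulas. The following is a minimal Python sketch of the textbook versions of both, not the authors' HSF algorithm: the function names `mutual_information` and `hoeffding_bound` are hypothetical, the features are assumed to be discrete, and the way the paper combines the two quantities is not stated in this record.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences.

    Standard plug-in estimate; the paper's modified measure is not shown here.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def hoeffding_bound(value_range, delta, n):
    """One-sided Hoeffding epsilon for the mean of n bounded i.i.d. samples.

    With probability at least 1 - delta, the sample mean of variables spanning
    `value_range` exceeds the true mean by less than the returned epsilon.
    """
    if n <= 0 or not 0.0 < delta < 1.0:
        raise ValueError("need n > 0 and 0 < delta < 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    feature = [0, 0, 1, 1]
    label = [0, 0, 1, 1]
    print(mutual_information(feature, label))
    print(hoeffding_bound(1.0, 0.05, 1000))
```

In a selection loop, a bound of this kind is commonly used to decide whether enough samples have been seen to trust the gap between two candidate scores; whether HSF uses it in that way is an assumption.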
Pages: 433-444
Related papers
50 in total
  • [21] WJMI: A New Feature Selection Algorithm Based on Weighted Joint Mutual Information
    Qi Xiuli
    Yin Chengxiang
    Cheng Kai
    Liao Xianglin
    Kang Xingdang
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON MECHATRONICS AND INDUSTRIAL INFORMATICS, 2015, 31 : 632 - 638
  • [22] PCA based on mutual information for feature selection
    Fan, X.-L. (fanxueli@mail.ioa.ac.cn), 1600, Northeast University (28):
  • [23] A Feature Selection Algorithm Based on Equal Interval Division and Conditional Mutual Information
    Gu, Xiangyuan
    Guo, Jichang
    Ming, Tao
    Xiao, Lijun
    Li, Chongyi
    NEURAL PROCESSING LETTERS, 2022, 54 (03) : 2079 - 2105
  • [24] Feature Selection Algorithm for Dynamically Weighted Conditional Mutual Information
    Zhang Li
    Chen Xiaobo
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (10) : 3028 - 3034
  • [25] A new algorithm for EEG feature selection using mutual information
    Deriche, M
    Al-Ani, A
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 1057 - 1060
  • [26] A NOVEL FEATURE SELECTION ALGORITHM WITH SUPERVISED MUTUAL INFORMATION FOR CLASSIFICATION
    Palanichamy, Jaganathan
    Ramasamy, Kuppuchamy
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2013, 22 (04)
  • [27] A review of feature selection methods based on mutual information
    Jorge R. Vergara
    Pablo A. Estévez
    Neural Computing and Applications, 2014, 24 : 175 - 186
  • [28] Subset selection algorithm based on mutual information
    Huh, Moon Y.
    COMPSTAT 2006: PROCEEDINGS IN COMPUTATIONAL STATISTICS, 2006, : 461 - 470
  • [29] An Improved Feature Selection for Categorization Based on Mutual Information
    Liu, Haifeng
    Su, Zhan
    Yao, Zeqing
    Liu, Shousheng
    WEB INFORMATION SYSTEMS AND MINING, PROCEEDINGS, 2009, 5854 : 80 - 87
  • [30] Feature selection using a mutual information based measure
    Al-Ani, A
    Deriche, M
    16TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITON, VOL IV, PROCEEDINGS, 2002, : 82 - 85