A feature selection algorithm based on Hoeffding inequality and mutual information

Cited by: 0
Authors
Yin, Chunyong [1]
Feng, Lu [1]
Ma, Luyu [1]
Yin, Zhichao [2]
Wang, Jin [1]
Affiliations
[1] School of Computer and Software, Jiangsu Key Laboratory of Meteorological Observation and Information Processing, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, China
[2] Nanjing No.1 Middle School, Nanjing, China
Keywords
Classification (of information); Data mining
DOI
10.14257/ijsip.2015.8.11.39
Abstract
With the rapid development of the Internet, data mining is being applied to Internet data ever more widely. However, complex and redundant features in the data sources make the data mining process inefficient and complicated, so feature selection research is essential to making data mining simpler and more efficient. In this paper, we propose a new measure of the degree of correlation among the internal features of a dataset, which is a variant of mutual information. We also introduce the Hoeffding inequality as a constraint in constructing the algorithm. In the experiments, we use the C4.5 classification algorithm as the test algorithm and compare HSF with BIF (a feature selection algorithm based on mutual information). The experimental results show that HSF performs better than BIF [1] in TP and FP rate; moreover, the feature subset obtained by HSF significantly improves the TP, FP, and memory usage of the C4.5 classification algorithm. © 2015 SERSC.
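The abstract names two ingredients, mutual information and the Hoeffding inequality, without giving formulas. The sketch below shows only the textbook versions of both: the empirical mutual information between two discrete sequences, and the one-sided Hoeffding bound epsilon = sqrt(R^2 ln(1/delta) / (2n)). It is not the paper's HSF algorithm or its modified correlation measure; the function names, parameters, and toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def hoeffding_bound(value_range, delta, n):
    """One-sided Hoeffding epsilon: the mean of n independent samples, each
    bounded in an interval of width value_range, exceeds its expectation by
    more than epsilon with probability at most delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


# Toy data (hypothetical): feature_a copies the label, feature_b is independent of it.
labels = [0, 0, 1, 1]
feature_a = [0, 0, 1, 1]
feature_b = [0, 1, 0, 1]

mi_a = mutual_information(feature_a, labels)
mi_b = mutual_information(feature_b, labels)
eps = hoeffding_bound(value_range=1.0, delta=0.05, n=1000)
```

One common way to combine the two, borrowed from Hoeffding-tree style stream learners and assumed here rather than taken from the paper, is to prefer one feature over another only when the gap between their scores exceeds epsilon. A mutual information estimate is not a plain sample mean, so such a use of the bound is a heuristic.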
Pages: 433-444
Related papers
  • [1] A Feature Selection Algorithm of Dynamic Data-Stream Based on Hoeffding Inequality
    Yin, Zhichao
    Yin, Chunyong
    Feng, Lu
    2015 4TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION TECHNOLOGY AND SENSOR APPLICATION (AITS), 2015, : 92 - 95
  • [2] RESEARCH ON FEATURE SELECTION ALGORITHM BASED ON MUTUAL INFORMATION AND GENETIC ALGORITHM
    Tang, Pan-Shi
    Tang, Xiao-Long
    Tao, Zhong-Yu
    Li, Jian-Ping
    2014 11TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2014, : 403 - 406
  • [3] Research on Mutual Information Feature Selection Algorithm Based on Genetic Algorithm
    College of Computer Science and Technology, Changchun University of Science and Technology, Jilin, Changchun
    130022, China
    Unknown
    J. Comput., 6 (131-141): : 131 - 141
  • [4] FEATURE SELECTION ALGORITHM BASED ON CONDITIONAL DYNAMIC MUTUAL INFORMATION
    Wang Liping
    INTERNATIONAL JOURNAL ON SMART SENSING AND INTELLIGENT SYSTEMS, 2015, 8 (01): : 316 - 337
  • [5] Spam Feature Selection Based on the Improved Mutual Information Algorithm
    Liang Ting
    Yu Qingsong
    2012 FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION NETWORKING AND SECURITY (MINES 2012), 2012, : 67 - 70
  • [6] Genetic algorithm for feature selection with mutual information
    Ge, Hong
    Hu, Tianliang
    2014 SEVENTH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID 2014), VOL 1, 2014, : 116 - 119
  • [7] A new feature selection algorithm based on Mutual Information with pairwise constraints
    Song Jing
    Yang Ming
    Ji Genlin
    Cai Wenbin
    2ND IEEE INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL (ICACC 2010), VOL. 3, 2010, : 483 - 486
  • [8] Feature Selection Method Based on the Improved of Mutual Information and Genetic Algorithm
    Qiu Ye
    Liu Peiyu
    Yang Yuzhen
    2009 IEEE INTERNATIONAL SYMPOSIUM ON IT IN MEDICINE & EDUCATION, VOLS 1 AND 2, PROCEEDINGS, 2009, : 836 - 839
  • [9] Feature selection algorithm for text classification based on improved mutual information
    丛帅
    张积宾
    徐志明
    王宇颖
    Journal of Harbin Institute of Technology(New series), 2011, (03) : 144 - 148
  • [10] A Filter Feature Selection Algorithm Based on Mutual Information for Intrusion Detection
    Zhao, Fei
    Zhao, Jiyong
    Niu, Xinxin
    Luo, Shoushan
    Xin, Yang
    APPLIED SCIENCES-BASEL, 2018, 8 (09):