A feature selection algorithm based on Hoeffding inequality and mutual information

Authors
Yin, Chunyong [1 ]
Feng, Lu [1 ]
Ma, Luyu [1 ]
Yin, Zhichao [2 ]
Wang, Jin [1 ]
Affiliations
[1] School of Computer and Software, Jiangsu Key Laboratory of Meteorological Observation and Information Processing, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, China
[2] Nanjing No.1 Middle School, Nanjing, China
Keywords
Classification (of information); Data mining
DOI
10.14257/ijsip.2015.8.11.39
Abstract
With the rapid development of the Internet, data mining is being applied to Internet data more and more widely. However, complex and redundant features in the data sources make the data mining process inefficient and complicated, so feature selection research is essential to making data mining simpler and more efficient. In this paper, we propose a new measure of the degree of correlation among the internal features of a dataset, which is a variant of mutual information. We also introduce the Hoeffding inequality as a constraint in constructing the algorithm. In the experiments, we use the C4.5 classification algorithm as the test algorithm and compare HSF with BIF (a feature selection algorithm based on mutual information). The experimental results show that HSF performs better than BIF [1] in TP and FP rate; moreover, the feature subset obtained by HSF significantly improves the TP rate, FP rate, and memory usage of the C4.5 classification algorithm. © 2015 SERSC.
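The abstract names two ingredients, mutual information and the Hoeffding inequality, without giving formulas. The following is a minimal sketch of the standard textbook forms of both, not the paper's HSF algorithm or its modified correlation measure. The function names, the toy data, and the idea of comparing the gap between two mutual information scores against the Hoeffding bound (a heuristic borrowed from Hoeffding-tree style learners) are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), with probabilities as counts / n
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def hoeffding_bound(value_range, delta, n):
    """One-sided Hoeffding epsilon for the mean of n i.i.d. samples.

    value_range is the width of the interval the samples lie in, and
    delta is the allowed failure probability.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


# Hypothetical toy data: one feature copies the class label, one is independent of it.
label = [0, 0, 1, 1, 0, 0, 1, 1]
f_copy = [0, 0, 1, 1, 0, 0, 1, 1]
f_noise = [0, 1, 0, 1, 0, 1, 0, 1]

mi_copy = mutual_information(f_copy, label)
mi_noise = mutual_information(f_noise, label)

# Assumed heuristic: treat the score gap as significant only if it exceeds epsilon.
# The range is 1 bit here because the label is binary.
eps = hoeffding_bound(value_range=1.0, delta=0.05, n=len(label))
prefer_copy = (mi_copy - mi_noise) > eps
```

In this sketch a feature is preferred only when its score advantage exceeds the bound, so a decision made on few samples needs a larger gap than one made on many.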
Pages: 433-444
Related papers
50 in total
  • [41] Feature selection based on weighted conditional mutual information
    Zhou, Hongfang
    Wang, Xiqian
    Zhang, Yao
    APPLIED COMPUTING AND INFORMATICS, 2024, 20 (1/2) : 55 - 68
  • [42] Feature Selection Based on Mutual Information for Language Recognition
    Deng, Yan
    Liu, Jia
    PROCEEDINGS OF THE 2009 2ND INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOLS 1-9, 2009, : 4319 - 4322
  • [43] Feature Selection by Computing Mutual Information Based on Partitions
    Yin, Chengxiang
    Zhang, Hongjun
    Zhang, Rui
    Zeng, Zilin
    Qi, Xiuli
    Feng, Yuntian
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (02): : 437 - 446
  • [44] A SURVEY FOR STUDY OF FEATURE SELECTION BASED ON MUTUAL INFORMATION
    Su, Xiangchenyang
    Liu, Fang
    2018 9TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2018,
  • [45] A filter approach to feature selection based on mutual information
    Huang, Jinjie
    Cai, Yunze
    Xu, Xiaoming
    PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006, : 84 - 89
  • [46] An Improved Feature Selection Algorithm with Conditional Mutual Information for Classification Problems
    Palanichamy, Jaganathan
    Ramasamy, Kuppuchamy
    2013 INTERNATIONAL CONFERENCE ON HUMAN COMPUTER INTERACTIONS (ICHCI), 2013,
  • [47] Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selection based on Mutual Information
    Alzubaidi, Abeer
    Cosma, Georgina
    Brown, David
    Pockley, A. Graham
    2016 9TH INTERNATIONAL CONFERENCE ON INTERACTIVE TECHNOLOGIES AND GAMES (ITAG), 2016, : 70 - 76
  • [48] Conditional mutual information-based feature selection algorithm for maximal relevance minimal redundancy
    Gu, Xiangyuan
    Guo, Jichang
    Xiao, Lijun
    Li, Chongyi
    APPLIED INTELLIGENCE, 2022, 52 (02) : 1436 - 1447
  • [50] Genetic Algorithm for the Mutual Information-Based Feature Selection in Univariate Time Series Data
    Siddiqi, Umair F.
    Sait, Sadiq M.
    Kaynak, Okyay
    IEEE ACCESS, 2020, 8 (08): : 9597 - 9609