Feature selection using a sinusoidal sequence combined with mutual information

Cited by: 6
|
Authors
Yuan, Gaoteng [1 ]
Lu, Lu [2 ]
Zhou, Xiaofeng [1 ]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing 211100, Peoples R China
[2] Lib Nanjing Forestry Univ, Nanjing 210037, Peoples R China
Keywords
Feature selection; Mutual information; Sinusoidal sequence; High-dimensional data; SSMI algorithm; FEATURE-EXTRACTION; CLASSIFICATION; ALGORITHM; FILTER
DOI
10.1016/j.engappai.2023.107168
Chinese Library Classification
TP [Automation technology; computer technology];
Discipline code
0812;
Abstract
Data classification is one of the most common tasks in machine learning, and feature selection is a key step in any classification task. Common feature selection methods mainly analyze the maximum relevance and minimum redundancy between features and labels while ignoring the number of key features selected, which inevitably leads to wasted effort in subsequent classification training. To solve this problem, a feature selection algorithm (SSMI) based on the combination of sinusoidal sequences and mutual information is proposed. First, the mutual information between each feature and the label is calculated, and interfering features in high-dimensional data are removed according to their mutual information values. Second, a sine function is constructed, and the remaining features are ordered sinusoidally according to each feature's mutual information value and its mean value across the different classes. By adjusting the period and phase of the sequence, the feature set with the largest between-class difference is found, yielding the subset of key features. Finally, three machine learning classifiers (KNN, RF, and SVM) are used to classify the key feature subsets, and several feature selection algorithms (JMI, mRMR, CMIM, SFS, etc.) are compared to assess the strengths and weaknesses of each. Compared with the other feature selection methods, the SSMI algorithm selects the fewest key features, an average reduction of 15 features, and improves average classification accuracy by 3% with the KNN classifier. On the HBV and SDHR datasets, the SSMI algorithm achieves classification accuracies of 81.26% and 83.12%, with sensitivity and specificity of 76.28%, 87.39% and 68.14%, 86.11%, respectively. These results show that the SSMI algorithm achieves higher classification accuracy with a smaller feature subset.
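The first two steps of the pipeline described above (mutual-information filtering, then sine-based scoring) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the threshold `mi_threshold`, the score `sin(period · MI + phase) · gap`, and all function names are assumptions made for the example; only the overall shape (filter by MI, then rank by a tunable sinusoidal score of MI and per-class mean differences) follows the abstract.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))  # joint probability
            if p_xy > 0:
                p_x = np.mean(x == xv)
                p_y = np.mean(y == yv)
                mi += p_xy * np.log2(p_xy / (p_x * p_y))
    return mi

def ssmi_sketch(X, y, mi_threshold=0.05, period=2 * np.pi, phase=0.0):
    """Illustrative SSMI-style selection.

    Step 1: drop features whose MI with the label falls below a threshold.
    Step 2: rank survivors by a sine score built from their MI value and
            the spread of their per-class means (formula is a guess).
    Returns (ranked feature indices, MI values for all features).
    """
    n_features = X.shape[1]
    mi = np.array([mutual_information(X[:, j], y) for j in range(n_features)])
    kept = np.where(mi >= mi_threshold)[0]

    # Spread of per-class feature means: larger gap = classes differ more.
    classes = np.unique(y)
    mean_gap = np.array([
        np.ptp([X[y == c, j].mean() for c in classes]) for j in kept
    ])

    # Sinusoidal score with tunable period and phase, as the abstract describes.
    score = np.sin(period * mi[kept] + phase) * mean_gap
    order = kept[np.argsort(-score)]  # best-scoring features first
    return order, mi
```

On a toy dataset where feature 0 equals the label and feature 1 is independent of it, the filter keeps only feature 0, since its MI equals the label entropy while the independent feature's MI is zero.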
Pages: 13