Feature Selection Algorithm for Noise Data

Authors
Xu, Hang [1]
Zhang, Shi-Chao [1]
Wu, Zhao-Jiang [1]
Li, Jia-Ye [2]
Affiliations
[1] School of Computer Science and Engineering, Central South University, Changsha 410083, China
[2] School of Computer Science and Information Engineering, Guangxi Normal University, Guilin 541004, China
Source
Ruan Jian Xue Bao/Journal of Software | 2021, Vol. 32, No. 11
Funding
National Natural Science Foundation of China
Keywords
Feature selection algorithm; Feature subspace; Features selection; Local structure; Locality preserving projections; Noise data; Noisy data; Regularisation; Sample space; Self-paced learning
DOI
10.13328/j.cnki.jos.006041
Abstract
Regularized feature selection algorithms are not effective at reducing the impact of noisy data, and they rarely consider the local structure of the sample space. As a result, after the samples are mapped to the feature subspace, the relationships between samples are inconsistent with those in the original space, which leads to unsatisfactory results from the data mining algorithm. This study proposes an anti-noise feature selection method that addresses both of these shortcomings of traditional algorithms. The method first adopts a self-paced learning training scheme, which greatly reduces the possibility of outliers entering training and also facilitates rapid convergence of the model. Then, a regression learner with regularization terms is used for embedded feature selection; it yields a sparse solution and mitigates over-fitting, making the model more robust. Finally, locality preserving projections are integrated by turning the projection matrix into the regression parameter matrix of the model, so that the original local structure among the samples is maintained while features are selected. Experiments on a series of benchmark data sets evaluate the algorithm, and the results show its effectiveness in terms of aCC and aRMSE. © Copyright 2021, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
Pages: 3440-3451
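
Illustrative sketch
The abstract names three ingredients: self-paced sample selection, a regularized regression learner for embedded feature selection, and a locality preserving term on the regression matrix. The Python sketch below shows one way these pieces could fit together; it is not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names `knn_laplacian` and `spl_feature_selection`, the parameters `alpha`, `beta`, `lam` and `mu`, the hard-threshold self-paced rule, the l2,1-norm regularizer solved by iterative reweighting, and the k-nearest-neighbour heat-kernel graph.

```python
import numpy as np


def knn_laplacian(X, k=5, sigma=1.0):
    """Graph Laplacian L = D - S of a symmetrized k-NN heat-kernel affinity S."""
    n = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)  # pairwise squared distances
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(sq[i])[1:k + 1]  # drop the closest entry (normally the sample itself)
        S[i, nbrs] = np.exp(-sq[i, nbrs] / (2.0 * sigma ** 2))
    S = np.maximum(S, S.T)  # make the affinity symmetric
    return np.diag(S.sum(axis=1)) - S


def spl_feature_selection(X, Y, alpha=0.1, beta=0.1, lam=None, mu=1.3,
                          n_outer=5, n_inner=20, eps=1e-8):
    """Assumed objective (not taken from the paper):
    sum_i v_i ||x_i W - y_i||^2 + alpha ||W||_{2,1} + beta tr(W^T X^T L X W),
    with binary self-paced weights v_i chosen by a loss threshold lam."""
    n, d = X.shape
    Y = Y.reshape(n, -1)
    M = X.T @ knn_laplacian(X) @ X  # locality preserving term
    # Warm start: ridge-like fit on all samples.
    W = np.linalg.solve(X.T @ X + alpha * np.eye(d) + beta * M, X.T @ Y)
    loss = np.sum((X @ W - Y) ** 2, axis=1)
    if lam is None:
        lam = np.median(loss)  # start with roughly the easier half of the samples
    v = np.ones(n)
    for _ in range(n_outer):
        # Step 1: keep only "easy" samples whose loss is within the threshold.
        loss = np.sum((X @ W - Y) ** 2, axis=1)
        v = (loss <= lam).astype(float)
        Xv = X * v[:, None]
        # Step 2: iteratively reweighted least squares for the l2,1 penalty.
        for _ in range(n_inner):
            row_norm = np.sqrt(np.sum(W ** 2, axis=1) + eps)
            D = np.diag(1.0 / (2.0 * row_norm))
            W = np.linalg.solve(Xv.T @ X + alpha * D + beta * M, Xv.T @ Y)
        lam *= mu  # Step 3: relax the threshold so harder samples may enter
    scores = np.linalg.norm(W, axis=1)  # one importance score per feature
    return W, v, scores


# Toy usage: only features 0 and 1 drive the target; five targets are corrupted.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
Y = 3.0 * X[:, [0]] - 2.0 * X[:, [1]] + 0.05 * rng.normal(size=(60, 1))
Y[:5] += 20.0
W, v, scores = spl_feature_selection(X, Y)
print("feature ranking (best first):", np.argsort(scores)[::-1])
print("samples kept in training:", int(v.sum()), "of", len(v))
```

In this sketch, features are ranked by the row norms of `W`, and the threshold `lam` grows by the factor `mu` each round so that harder samples can be admitted gradually. A soft-weighting self-paced rule or a different graph construction would be equally reasonable choices.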