Stable feature selection based on instance learning, redundancy elimination and efficient subsets fusion

Author
Afef Ben Brahim
Affiliation
[1] Université de Tunis,Tunis Business School, LARODEC
Source
Neural Computing and Applications | 2021 / Vol. 33
Keywords
Feature selection; High dimensionality; Instance-based learning; Stability;
DOI
Not available
Abstract
Feature selection is frequently used as a preprocessing step in data mining and is attracting growing attention due to the increasing amounts of data emerging from different domains. High data dimensionality increases the noise and thus the error of learning algorithms. Filter methods for feature selection are particularly fast and useful for high-dimensional datasets. Existing methods focus on producing feature subsets that improve predictive performance, but they often suffer from instability. Instance-based filters, for example, are considered among the most effective methods; they rank features based on the neighborhoods of instances. However, because the feature weights fluctuate with the instances, small changes in the training data result in a different selected subset of features. On the other hand, some other filters generate stable results but achieve only modest predictive performance. The absence of a trade-off between stability and classification accuracy reduces the reliability of the feature selection results. To deal with this issue, we propose filter methods that improve the stability of feature selection while preserving optimal predictive accuracy and without increasing the complexity of the feature selection algorithms. The proposed approaches first use the strength of instance learning to identify initial sets of relevant features, and then, in a second stage, use the advantage of aggregation techniques to increase the stability of the final set. Two classification algorithms are used to evaluate the predictive performance of our proposed instance-based filters compared to state-of-the-art algorithms. The obtained results show the efficiency of our methods in improving both classification accuracy and feature selection stability for high-dimensional datasets.
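The two-stage idea in the abstract (instance-based ranking, then aggregation across subsets) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a basic Relief filter for binary labels as the instance-based ranker, random 80% subsamples as the "subsets", and mean-rank aggregation as the fusion step, and it omits redundancy elimination entirely. The function names `relief_weights`, `aggregate_top_k`, and `make_toy_data` are hypothetical.

```python
import numpy as np


def relief_weights(X, y, rng, n_iter=50):
    """Basic Relief weights for binary labels (assumed instance-based filter)."""
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero on constant features
    w = np.zeros(d)
    for i in rng.integers(0, n, size=n_iter):
        diff = np.abs(X - X[i]) / span      # per-feature normalized differences
        dist = diff.sum(axis=1)             # L1 distance to every instance
        dist[i] = np.inf                    # exclude the sampled instance itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class
        w += diff[miss] - diff[hit]
    return w / n_iter


def aggregate_top_k(X, y, k, n_subsets=20, seed=0):
    """Rank features on several subsamples and fuse the rankings by mean rank."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    rank_sum = np.zeros(d)
    for _ in range(n_subsets):
        idx = rng.choice(n, size=int(0.8 * n), replace=False)
        w = relief_weights(X[idx], y[idx], rng)
        rank_sum += np.argsort(np.argsort(-w))  # rank 0 = most relevant
    return np.sort(np.argsort(rank_sum)[:k])


def make_toy_data(n=100, d=20, seed=1):
    """Synthetic data (assumption): only features 0 and 1 depend on the label."""
    rng = np.random.default_rng(seed)
    y = np.arange(n) % 2
    X = rng.normal(size=(n, d))
    X[:, 0] += 4 * y
    X[:, 1] -= 4 * y
    return X, y
```

Calling `aggregate_top_k(*make_toy_data(), k=2)` returns the indices of the two features with the best mean rank across subsamples. Averaging ranks rather than raw weights is one simple way to damp the per-sample fluctuation the abstract describes; other aggregation schemes would fit the same skeleton.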
Pages: 1221-1232
Page count: 11