Random feature subset selection for analysis of data with missing features

Cited by: 0
Authors
DePasquale, Joseph [1]
Polikar, Robi [1]
Affiliation
[1] Rowan Univ, Signal Processing & Pattern Recognit Lab, Glassboro, NJ 08028 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We discuss an ensemble-of-classifiers algorithm for the missing feature problem. The approach is inspired in part by the random subspace method and in part by the incremental learning algorithm Learn++. The premise is to generate a sufficiently large number of classifiers, each trained on a different random combination of features drawn from an iteratively updated distribution. To classify an instance with missing features, only those classifiers whose training data did not include the currently missing features are used. Their outputs are combined by majority voting to obtain the final classification of the instance. We previously presented preliminary results on a similar approach, which could handle up to 10% missing data. In this study, we extend that work to include different rules for updating the distribution, and we examine the effect of the algorithm's primary free parameter (the number of features used to train each classifier in the ensemble) on overall classification performance. We show that the algorithm can now accommodate up to 30% missing features without a significant drop in performance.
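The procedure outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid base classifier, the function names, the `decay` parameter, and the specific update rule (down-weighting features that were just selected) are all hypothetical choices made for the example. The abstract says only that the distribution is updated iteratively and that several update rules were examined.

```python
import random
from collections import Counter


def train_centroids(X, y, feats):
    """Nearest-centroid base classifier restricted to the feature indices in feats.

    The base classifier is an assumption; the abstract does not name one.
    """
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, f in enumerate(feats):
            acc[k] += row[f]
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def sample_features(weights, k, rng):
    """Draw k distinct feature indices with probability proportional to weights."""
    pool, w, chosen = list(range(len(weights))), list(weights), []
    for _ in range(k):
        pick = rng.choices(range(len(pool)), weights=w)[0]
        chosen.append(pool.pop(pick))
        w.pop(pick)
    return sorted(chosen)


def train_ensemble(X, y, n_classifiers, n_feats, decay=0.5, seed=0):
    """Train classifiers on random feature subsets drawn from an updated distribution."""
    rng = random.Random(seed)
    weights = [1.0] * len(X[0])
    ensemble = []
    for _ in range(n_classifiers):
        feats = sample_features(weights, n_feats, rng)
        ensemble.append((feats, train_centroids(X, y, feats)))
        # Assumed update rule: make just-used features less likely next time,
        # then renormalize so the weights form a distribution.
        for f in feats:
            weights[f] *= decay
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Majority vote over classifiers that used none of the missing features.

    Missing features in x are marked with None. Returns None when no
    classifier in the ensemble is usable for this instance.
    """
    votes = Counter()
    for feats, centroids in ensemble:
        if any(x[f] is None for f in feats):
            continue  # this classifier was trained on a feature that is missing
        label = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        )
        votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None
```

Calling `train_ensemble(X, y, n_classifiers=20, n_feats=2)` and then `predict(ensemble, [0.1, None, 0.2, None])` classifies an instance using only the classifiers trained on features 0 and 2. A smaller `n_feats` leaves more classifiers usable when features are missing, but each one sees less information, which is the trade-off behind the free parameter the abstract mentions.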
Pages: 2378-2383
Page count: 6
Related papers
50 in total
  • [1] Random feature subset selection for ensemble based classification of data with missing features
    DePasquale, Joseph
    Polikar, Robi
    [J]. MULTIPLE CLASSIFIER SYSTEMS, PROCEEDINGS, 2007, 4472 : 251 - +
  • [2] A Conservative Feature Subset Selection Algorithm with Missing Data
    Aussem, Alex
    de Morais, Sergio Rodrigues
    [J]. ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, : 725 - 730
  • [3] A conservative feature subset selection algorithm with missing data
    Aussem, Alex
    de Morais, Sergio Rodrigues
    [J]. NEUROCOMPUTING, 2010, 73 (4-6) : 585 - 590
  • [4] Feature subset selection for data and feature streams: a review
    Carlos Villa-Blanco
    Concha Bielza
    Pedro Larrañaga
    [J]. Artificial Intelligence Review, 2023, 56 : 1011 - 1062
  • [5] Feature subset selection for data and feature streams: a review
    Villa-Blanco, Carlos
    Bielza, Concha
    Larranaga, Pedro
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL 1) : 1011 - 1062
  • [6] Random Subset Feature Selection and Classification of Lung Sound
    Don, S.
    [J]. INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND DATA SCIENCE, 2020, 167 : 313 - 322
  • [7] Random approximated greedy search for feature subset selection
    Gao, F
    Ho, YC
    [J]. ASIAN JOURNAL OF CONTROL, 2004, 6 (03) : 439 - 446
  • [8] Biases in feature selection with missing data
    Seijo-Pardo, Borja
    Alonso-Betanzos, Amparo
    Bennett, Kristin P.
    Bolon-Canedo, Veronica
    Josse, Julie
    Saeed, Mehreen
    Guyon, Isabelle
    [J]. NEUROCOMPUTING, 2019, 342 : 97 - 112
  • [9] Causal Feature Selection with Missing Data
    Yu, Kui
    Yang, Yajing
    Ding, Wei
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, 16 (04)
  • [10] Classification Performance Improvement Using Random Subset Feature Selection Algorithm for Data Mining
    Lakshmipadmaja, D.
    Vishnuvardhan, B.
    [J]. BIG DATA RESEARCH, 2018, 12 : 1 - 12