The effect of missing data on learning classifier system learning rate and classification performance

Cited by: 0
Authors
Holmes, JH [1]
Bilker, WB [1]
Affiliation
[1] Univ Penn, Sch Med, Ctr Clin Epidemiol & Biostat, Philadelphia, PA 19104 USA
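Source
LEARNING CLASSIFIER SYSTEMS | 2002 / Vol. 2661
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405
Abstract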
Missing data pose a potential threat to learning and classification in that they may compromise the ability of a system to develop robust, generalized models of the environment in which it operates. This investigation reports on the effects of various types of missing data, present in varying densities in a group of simulated datasets, on learning classifier system performance. It was found that missing data have an adverse effect on learning classifier system (LCS) learning and classification performance, the latter of which is not seen in See5, a robust decision tree inducer. Specific adverse effects include decreased learning rate, decreased accuracy of classification of novel data on testing, increased proportions of testing cases that cannot be classified, and increased variability in these metrics. In addition, the effects are correlated with the density of missing values in a dataset, as well as with the type of missing data: whether the missingness is random and ignorable, or systematic and therefore non-ignorable.
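The distinction the abstract draws between missing-data types and densities can be illustrated with a minimal sketch. This is not the authors' simulation procedure, which this record does not describe. It assumes binary features, reads "random and ignorable" as missingness that is independent of the data, and reads "systematically missing" as missingness that depends on the hidden value itself. The function names `inject_random`, `inject_systematic`, and `missing_fraction` are hypothetical.

```python
import random

def inject_random(rows, density, rng):
    """Blank each value independently with probability `density`
    (assumed reading of 'random and ignorable' missingness)."""
    return [[None if rng.random() < density else v for v in row] for row in rows]

def inject_systematic(rows, density, rng, feature=0):
    """Blank `feature` only where its true value is 1, so whether a value
    is missing depends on the value itself (assumed reading of
    'systematically missing and non-ignorable')."""
    out = [list(row) for row in rows]
    for row in out:
        if row[feature] == 1 and rng.random() < density:
            row[feature] = None
    return out

def missing_fraction(rows):
    """Fraction of all cells that are missing (the dataset's missing-value density)."""
    cells = [v for row in rows for v in row]
    return sum(v is None for v in cells) / len(cells)

# Hypothetical simulated dataset: 2000 cases, 5 binary features.
rng = random.Random(0)
data = [[rng.randint(0, 1) for _ in range(5)] for _ in range(2000)]

mcar = inject_random(data, 0.10, rng)      # about 10% of cells missing
mnar = inject_systematic(data, 0.50, rng)  # feature 0 missing only when it was 1
```

In the systematic case, the surviving values of feature 0 are biased toward 0, which is one plausible reason a learner could be misled more by this kind of missingness than by random gaps of the same density.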
Pages: 46-60
Page count: 15