A Comparative Look at the Resilience of Discriminative and Generative Classifiers to Missing Data in Longitudinal Datasets

Times cited: 0
Authors
Pingi, Sharon Torao [1]
Abul Bashar, Md
Nayak, Richi
Affiliations
[1] Queensland Univ Technol, Sch Comp Sci, Fac Sci, Brisbane, Qld 4000, Australia
Source
DATA MINING, AUSDM 2022 | 2022 / Vol. 1741
Keywords
GAN; Missing data; Longitudinal; Discriminative classifiers; Generative classifiers
DOI
10.1007/978-981-19-8746-5_10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Longitudinal datasets often suffer from missing data caused by irregular sampling rates and drop-outs, which leads to sub-optimal classification performance. Given the breakthrough of deep generative models in data generation, this paper proposes a conditional Generative Adversarial Network (GAN) based longitudinal classifier (LoGAN) to address this problem in longitudinal datasets. LoGAN is evaluated against commonly used and state-of-the-art discriminative and generative classifiers. Comparative performance is presented showing the sensitivity of classifiers to data missingness, in both balanced and imbalanced datasets. Results show that the GAN-based models performed on par with other deep-learning-based models and performed comparatively better on a balanced dataset, while non-deep-learning-based discriminative models, in particular the ensemble models, performed better when data was imbalanced. Specifically, F1 scores for LoGAN models were >= 80% at missing-data rates of up to 20% in the temporal component of the dataset and >= 60% at missing rates from 40-100%. Non-deep generative models showed low performance once missing data was introduced.
Pages: 133-147
Page count: 15