Leveraging Variational Autoencoders for Multiple Data Imputation

Cited by: 3
Authors
Roskams-Hieter, Breeshey [1, 2]
Wells, Jude [2, 3]
Wade, Sara [1]
Affiliations
[1] University of Edinburgh, Edinburgh, Midlothian, Scotland
[2] Health Data Research UK, London, England
[3] UCL, London, England
Funding
Wellcome Trust (UK)
Keywords
VAEs; multiple imputation; missing data
DOI
10.1007/978-3-031-43412-9_29
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Missing data persists as a major barrier to data analysis across numerous applications. Recently, deep generative models have been used for imputation of missing data, motivated by their ability to learn complex and non-linear relationships. In this work, we investigate the ability of variational autoencoders (VAEs) to account for uncertainty in missing data through multiple imputation. We find that VAEs provide poor empirical coverage of missing data, with underestimation and overconfident imputations. To overcome this, we employ beta-VAEs, which, viewed from a generalized Bayes framework, provide robustness to model misspecification. Assigning a good value of beta is critical for uncertainty calibration, and we demonstrate how this can be achieved using cross-validation. We assess three alternative methods for sampling from the posterior distribution of missing values and apply the approach to transcriptomics datasets with various simulated missingness scenarios. Finally, we show that single imputation in transcriptomic data can cause false discoveries in downstream tasks, and that employing multiple imputation with beta-VAEs can effectively mitigate these inaccuracies.
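The "empirical coverage" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `empirical_coverage`, the 95% level, and the synthetic Gaussian draws are all assumptions made for illustration. The idea is that, for entries that were masked on purpose, one checks how often the true value falls inside the interval spanned by the multiple imputations.

```python
import numpy as np

def empirical_coverage(draws, truth, level=0.95):
    """Fraction of masked true values that fall inside the central
    `level` interval of their imputation draws.

    draws: shape (M, n_missing), M imputed values per missing entry
    truth: shape (n_missing,), held-out ground truth for those entries
    """
    alpha = 1.0 - level
    lower = np.quantile(draws, alpha / 2, axis=0)
    upper = np.quantile(draws, 1 - alpha / 2, axis=0)
    return float(np.mean((truth >= lower) & (truth <= upper)))

# Synthetic stand-ins for imputation draws (hypothetical, not VAE output).
rng = np.random.default_rng(0)
n_missing, M = 2000, 200
truth = rng.normal(0.0, 1.0, size=n_missing)

# Draws whose spread matches the truth: coverage should sit near the nominal level.
calibrated = rng.normal(0.0, 1.0, size=(M, n_missing))
# Draws that are too narrow: coverage falls well below the nominal level.
overconfident = rng.normal(0.0, 0.3, size=(M, n_missing))

cov_cal = empirical_coverage(calibrated, truth)
cov_over = empirical_coverage(overconfident, truth)
print(f"calibrated: {cov_cal:.2f}, overconfident: {cov_over:.2f}")
```

Coverage far below the nominal level, as in the second case, is the kind of overconfidence the abstract attributes to standard VAEs. In a setting like the one described, a value of beta could plausibly be chosen by comparing such coverage estimates across cross-validation folds.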
Pages: 491-506
Number of pages: 16
Related papers
50 in total
  • [21] Physiological Waveform Imputation of Missing Data using Convolutional Autoencoders
    Miller, Daniel
    Ward, Andrew
    Bambos, Nicholas
    Scheinker, David
    Shin, Andrew
    2018 IEEE 20TH INTERNATIONAL CONFERENCE ON E-HEALTH NETWORKING, APPLICATIONS AND SERVICES (HEALTHCOM), 2018
  • [22] Missing value imputation in food composition data with denoising autoencoders
    Gjorshoska, Ivana
    Eftimov, Tome
    Trajanov, Dimitar
    JOURNAL OF FOOD COMPOSITION AND ANALYSIS, 2022, 112
  • [23] Missing Data Imputation via Denoising Autoencoders: The Untold Story
    Costa, Adriana Fonseca
    Santos, Miriam Seoane
    Soares, Jastin Pompeu
    Abreu, Pedro Henriques
    ADVANCES IN INTELLIGENT DATA ANALYSIS XVII, IDA 2018, 2018, 11191 : 87 - 98
  • [24] Imputation of Missing Traffic Flow Data Using Denoising Autoencoders
    Jiang, Boyuan
    Siddiqi, Muhammad Danial
    Asadi, Reza
    Regan, Amelia
    12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 84 - 91
  • [25] Variational autoencoders for 3D data processing
    Szilárd Molnár
    Levente Tamás
    Artificial Intelligence Review, 57
  • [26] Variational autoencoders for 3D data processing
    Molnar, Szilard
    Tamas, Levente
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (02)
  • [27] Variational autoencoders learn transferrable representations of metabolomics data
    Gomari, Daniel P.
    Schweickart, Annalise
    Cerchietti, Leandro
    Paietta, Elisabeth
    Fernandez, Hugo
    Al-Amin, Hassen
    Suhre, Karsten
    Krumsiek, Jan
    COMMUNICATIONS BIOLOGY, 2022, 5 (01)
  • [28] Variational autoencoders learn transferrable representations of metabolomics data
    Daniel P. Gomari
    Annalise Schweickart
    Leandro Cerchietti
    Elisabeth Paietta
    Hugo Fernandez
    Hassen Al-Amin
    Karsten Suhre
    Jan Krumsiek
    Communications Biology, 5
  • [29] Seismic labeled data expansion using variational autoencoders
    Li, Kunhong
    Chen, Song
    Hu, Guangmin
    ARTIFICIAL INTELLIGENCE IN GEOSCIENCES, 2020, 1 : 24 - 30
  • [30] A Generation of Enhanced Data by Variational Autoencoders and Diffusion Modeling
    Kim, Young-Jun
    Lee, Seok-Pil
    ELECTRONICS, 2024, 13 (07)