Unsupervised data imputation with multiple importance sampling variational autoencoders

Cited by: 0
Authors
Kuang, Shenfen [1]
Huang, Yewen [2]
Song, Jie [1]
Affiliations
[1] Shaoguan Univ, Sch Math & Stat, Shaoguan 512005, Peoples R China
[2] Guangdong Polytech Normal Univ, Sch Elect & Informat, Guangzhou 510665, Peoples R China
Source
SCIENTIFIC REPORTS | 2025 / Vol. 15 / Issue 01
Keywords
Missing data; Variational autoencoders; Multiple importance sampling; Resampling; MISSING DATA IMPUTATION;
DOI
10.1038/s41598-025-87641-0
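Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code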
07 ; 0710 ; 09 ;
Abstract
Recently, deep latent variable models have made significant progress in dealing with missing data problems, benefiting from their ability to capture intricate and non-linear relationships within the data. In this work, we further investigate the potential of Variational Autoencoders (VAEs) in addressing the uncertainty associated with missing data via a multiple importance sampling strategy. We propose a Missing data Multiple Importance Sampling Variational Auto-Encoder (MMISVAE) method to effectively model incomplete data. Our approach consists of a learning step and an imputation step. During the learning step, the mixture components are represented by multiple separate encoder networks, which are later combined through simple averaging to enhance the latent representation capabilities of the VAEs when dealing with incomplete data. The statistical model and variational distributions are iteratively updated by maximizing the Multiple Importance Sampling Evidence Lower Bound (MISELBO) on the joint log-likelihood. In the imputation step, missing data are estimated using conditional expectation through multiple importance resampling. We propose an efficient imputation algorithm that broadens the scope of the Missing data Importance Weighted Auto-Encoder (MIWAE) by incorporating multiple proposal probability distributions and a resampling scheme. One notable characteristic of our method is the completely unsupervised nature of both the learning and imputation processes. Through comprehensive experimental analysis, we present evidence of the effectiveness of our method in improving the imputation accuracy of incomplete data when compared to current state-of-the-art VAE-based methods.
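The imputation idea in the abstract (a conditional expectation estimated by importance sampling from several proposals, followed by resampling) can be illustrated on a toy problem. The sketch below is not the authors' MMISVAE implementation: it assumes a hypothetical linear-Gaussian model in place of the decoder network, fixed Gaussian proposals in place of the trained encoder networks, and an equally weighted mixture of those proposals as the importance density. All names (`mis_impute`, `w_obs`, `w_mis`, and so on) are illustrative.

```python
import numpy as np

def log_normal(x, mean, std):
    """Log-density of a univariate Gaussian, evaluated elementwise."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def mis_impute(x_obs, w_obs, w_mis, sigma, prop_means, prop_stds, n_per_prop, rng):
    """Impute a missing coordinate under the assumed toy model
        z ~ N(0, 1),  x_obs | z ~ N(w_obs * z, sigma^2),  x_mis | z ~ N(w_mis * z, sigma^2)
    using several Gaussian proposals for z combined as an equal-weight mixture."""
    prop_means = np.asarray(prop_means, dtype=float)
    prop_stds = np.asarray(prop_stds, dtype=float)
    n_prop = prop_means.size

    # Draw the same number of latent samples from every proposal, then pool them.
    z = rng.normal(prop_means[:, None], prop_stds[:, None],
                   size=(n_prop, n_per_prop)).ravel()

    # Density of the equal-weight mixture of proposals at every pooled sample.
    log_q_each = log_normal(z[None, :], prop_means[:, None], prop_stds[:, None])
    log_q_mix = np.logaddexp.reduce(log_q_each, axis=0) - np.log(n_prop)

    # Unnormalized target: prior times likelihood of the observed coordinate.
    log_joint = log_normal(z, 0.0, 1.0) + log_normal(x_obs, w_obs * z, sigma)

    # Self-normalized importance weights, computed stably in log space.
    log_w = log_joint - log_q_mix
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    # Single imputation: weighted average of E[x_mis | z] = w_mis * z.
    cond_mean = np.sum(w * (w_mis * z))

    # Multiple imputation: resample latents by weight, then sample x_mis | z.
    idx = rng.choice(z.size, size=z.size, replace=True, p=w)
    draws = rng.normal(w_mis * z[idx], sigma)
    return cond_mean, draws

# Toy settings (all values are assumptions chosen for illustration).
w_obs, w_mis, sigma, x_obs = 1.0, 2.0, 0.5, 1.5
rng = np.random.default_rng(0)
estimate, draws = mis_impute(x_obs, w_obs, w_mis, sigma,
                             prop_means=[0.0, 1.0, 2.0],
                             prop_stds=[1.0, 1.0, 1.0],
                             n_per_prop=20000, rng=rng)

# In this linear-Gaussian toy model the conditional mean is available in closed form.
exact = w_mis * w_obs * x_obs / (w_obs ** 2 + sigma ** 2)
print(estimate, exact)
```

Because the toy model is linear-Gaussian, the estimate can be compared with the closed-form conditional mean; with a neural decoder no such check exists, which is why the quality of the proposals matters. Dividing by the mixture density rather than by each sample's own proposal is one common way to combine several proposals and tends to keep the weights bounded when the proposals overlap.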
Pages: 16
Related papers
50 in total
  • [31] Unsupervised pathology detection in medical images using conditional variational autoencoders
    Uzunova, Hristina
    Schultz, Sandra
    Handels, Heinz
    Ehrhardt, Jan
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2019, 14 (03) : 451 - 461
  • [32] Medical Data Wrangling With Sequential Variational Autoencoders
    Barrejon, Daniel
    Olmos, Pablo M.
    Artes-Rodriguez, Antonio
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (06) : 2737 - 2745
  • [33] Variational Autoencoders for Data Augmentation in Clinical Studies
    Papadopoulos, Dimitris
    Karalis, Vangelis D.
    APPLIED SCIENCES-BASEL, 2023, 13 (15):
  • [34] Variational Autoencoders for Sparse and Overdispersed Discrete Data
    Zhao, He
    Rai, Piyush
    Du, Lan
    Buntine, Wray
    Phung, Dinh
    Zhou, Mingyuan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1684 - 1693
  • [35] Empirical Evaluation of Variational Autoencoders for Data Augmentation
    Jorge, Javier
    Vieco, Jesus
    Paredes, Roberto
    Andreu Sanchez, Joan
    Miguel Benedi, Jose
    PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISIGRAPP 2018), VOL 5: VISAPP, 2018, : 96 - 104
  • [36] Benchmarking variational AutoEncoders on cancer transcriptomics data
    Eltager, Mostafa
    Abdelaal, Tamim
    Charrout, Mohammed
    Mahfouz, Ahmed
    Reinders, Marcel J. T.
    Makrodimitris, Stavros
    PLOS ONE, 2023, 18 (10):
  • [37] Posterior Consistency for Missing Data in Variational Autoencoders
    Sudak, Timur
    Tschiatschek, Sebastian
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT II, 2023, 14170 : 508 - 524
  • [38] Deterministic Decoding for Discrete Data in Variational Autoencoders
    Polykovskiy, Daniil
    Vetrov, Dmitry
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 3046 - 3055
  • [39] Physiological Waveform Imputation of Missing Data using Convolutional Autoencoders
    Miller, Daniel
    Ward, Andrew
    Bambos, Nicholas
    Scheinker, David
    Shin, Andrew
    2018 IEEE 20TH INTERNATIONAL CONFERENCE ON E-HEALTH NETWORKING, APPLICATIONS AND SERVICES (HEALTHCOM), 2018,
  • [40] Missing value imputation in food composition data with denoising autoencoders
    Gjorshoska, Ivana
    Eftimov, Tome
    Trajanov, Dimitar
    JOURNAL OF FOOD COMPOSITION AND ANALYSIS, 2022, 112