Unsupervised data imputation with multiple importance sampling variational autoencoders

Cited by: 0
Authors
Kuang, Shenfen [1]
Huang, Yewen [2]
Song, Jie [1]
Affiliations
[1] Shaoguan Univ, Sch Math & Stat, Shaoguan 512005, Peoples R China
[2] Guangdong Polytech Normal Univ, Sch Elect & Informat, Guangzhou 510665, Peoples R China
Source
SCIENTIFIC REPORTS | 2025 / Vol. 15 / Issue 01
Keywords
Missing data; Variational autoencoders; Multiple importance sampling; Resampling; MISSING DATA IMPUTATION;
DOI
10.1038/s41598-025-87641-0
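Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09
Abstract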
Recently, deep latent variable models have made significant progress in dealing with missing data problems, benefiting from their ability to capture intricate and non-linear relationships within the data. In this work, we further investigate the potential of Variational Autoencoders (VAEs) in addressing the uncertainty associated with missing data via a multiple importance sampling strategy. We propose a Missing data Multiple Importance Sampling Variational Auto-Encoder (MMISVAE) method to effectively model incomplete data. Our approach consists of a learning step and an imputation step. During the learning step, the mixture components are represented by multiple separate encoder networks, which are later combined through simple averaging to enhance the latent representation capabilities of the VAEs when dealing with incomplete data. The statistical model and variational distributions are iteratively updated by maximizing the Multiple Importance Sampling Evidence Lower Bound (MISELBO) on the joint log-likelihood. In the imputation step, missing values are estimated using conditional expectation through multiple importance resampling. We propose an efficient imputation algorithm that broadens the scope of the Missing data Importance Weighted Auto-Encoder (MIWAE) by incorporating multiple proposal probability distributions and a resampling scheme. One notable characteristic of our method is the fully unsupervised nature of both the learning and imputation processes. Through comprehensive experimental analysis, we present evidence of the effectiveness of our method in improving the imputation accuracy of incomplete data when compared to current state-of-the-art VAE-based methods.
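The imputation step described in the abstract (conditional expectation estimated by importance sampling from several proposals combined by simple averaging, followed by resampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the decoder here is a hypothetical linear-Gaussian toy model, the three fixed Gaussian proposals stand in for the separate encoder networks, and all names (`mis_impute`, `w_obs`, `w_mis`, `sigma`) are assumptions chosen for illustration. The toy model is used only because its exact conditional mean is known, so the estimate can be checked.

```python
import numpy as np


def log_normal(x, mean, std):
    """Log-density of N(mean, std^2) evaluated at x."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def mis_impute(x_obs, w_obs, w_mis, sigma, proposals, n_per_proposal, rng):
    """Estimate E[x_mis | x_obs] in a toy model (assumed, not from the paper):
    z ~ N(0, 1), x_obs | z ~ N(w_obs * z, sigma^2), x_mis | z ~ N(w_mis * z, sigma^2).
    `proposals` is a list of (mean, std) pairs standing in for encoder outputs."""
    means = np.array([m for m, _ in proposals])
    stds = np.array([s for _, s in proposals])
    n_prop = len(proposals)

    # Draw the same number of latent samples from every proposal, then pool them.
    z = rng.normal(means[:, None], stds[:, None], size=(n_prop, n_per_proposal)).ravel()

    # Density of the equally weighted mixture of proposals at each pooled sample.
    log_q_each = log_normal(z[None, :], means[:, None], stds[:, None])
    log_q_mix = np.logaddexp.reduce(log_q_each, axis=0) - np.log(n_prop)

    # Unnormalized target: prior times likelihood of the observed coordinate.
    log_joint = log_normal(z, 0.0, 1.0) + log_normal(x_obs, w_obs * z, sigma)

    # Self-normalized importance weights, computed stably in log space.
    log_w = log_joint - log_q_mix
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    # Weighted estimate of E[x_mis | x_obs], using E[x_mis | z] = w_mis * z.
    snis = w_mis * np.sum(w * z)

    # Importance resampling: redraw latents in proportion to their weights.
    idx = rng.choice(z.size, size=z.size, p=w)
    resampled = w_mis * z[idx].mean()
    return snis, resampled


rng = np.random.default_rng(0)
x_obs, w_obs, w_mis, sigma = 1.0, 1.0, 2.0, 0.5
proposals = [(0.0, 1.0), (1.0, 0.7), (0.5, 1.5)]  # hypothetical encoder outputs

snis_estimate, resampled_estimate = mis_impute(
    x_obs, w_obs, w_mis, sigma, proposals, n_per_proposal=20000, rng=rng
)

# Exact conditional mean for this linear-Gaussian toy model.
exact = w_mis * w_obs * x_obs / (w_obs**2 + sigma**2)
print(snis_estimate, resampled_estimate, exact)
```

Dividing by the averaged mixture density, rather than by the density of the single proposal that produced each sample, is what makes this a multiple importance sampling estimator; in the paper's setting the proposals would be learned and the decoder would be a neural network.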
Pages: 16
Related papers
50 in total
  • [1] Leveraging Variational Autoencoders for Multiple Data Imputation
    Roskams-Hieter, Breeshey
    Wells, Jude
    Wade, Sara
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT I, 2023, 14169 : 491 - 506
  • [2] Partial Multiple Imputation With Variational Autoencoders: Tackling Not at Randomness in Healthcare Data
    Pereira, Ricardo Cardoso
    Abreu, Pedro Henriques
    Rodrigues, Pedro Pereira
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (08) : 4218 - 4227
  • [3] Unsupervised Imputation of Non-Ignorably Missing Data Using Importance-Weighted Autoencoders
    Lim, David K.
    Rashid, Naim U.
    Oliva, Junier B.
    Ibrahim, Joseph G.
    STATISTICS IN BIOPHARMACEUTICAL RESEARCH, 2024,
  • [4] Data Augmentation with Variational Autoencoders and Manifold Sampling
    Chadebec, Clement
    Allassonniere, Stephanie
    DEEP GENERATIVE MODELS, AND DATA AUGMENTATION, LABELLING, AND IMPERFECTIONS, 2021, 13003 : 184 - 192
  • [5] Variational Autoencoders for Missing Data Imputation with Application to a Simulated Milling Circuit
    McCoy, John T.
    Kroon, Steve
    Auret, Lidia
    IFAC PAPERSONLINE, 2018, 51 (21): : 141 - 146
  • [6] Training Variational Autoencoders with Discrete Latent Variables Using Importance Sampling
    Bartler, Alexander
    Wiewel, Felix
    Mauch, Lukas
    Yang, Bin
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [7] Joint variational autoencoders for multimodal imputation and embedding
    Noah Cohen Kalafut
    Xiang Huang
    Daifeng Wang
    Nature Machine Intelligence, 2023, 5 : 631 - 642
  • [8] Joint variational autoencoders for multimodal imputation and embedding
    Kalafut, Noah Cohen
    Huang, Xiang
    Wang, Daifeng
    NATURE MACHINE INTELLIGENCE, 2023, 5 (06) : 631 - +
  • [9] Variational Autoencoding with Conditional Iterative Sampling for Missing Data Imputation
    Kuang, Shenfen
    Song, Jie
    Wang, Shangjiu
    Zhu, Huafeng
    MATHEMATICS, 2024, 12 (20)
  • [10] Multiple Imputation for Biomedical Data using Monte Carlo Dropout Autoencoders
    Miok, Kristian
    Dong Nguyen-Doan
    Robnik-Sikonja, Marko
    Zaharie, Daniela
    2019 E-HEALTH AND BIOENGINEERING CONFERENCE (EHB), 2019,