Deep Generative Imputation Model for Missing Not At Random Data

Cited by: 0
Authors
Chen, Jialei [1]
Xu, Yuanbo [1]
Wang, Pengyang [2]
Yang, Yongjian [1]
Affiliations
[1] Jilin Univ, Dept Comp Sci & Technol, MIC Lab, Changchun, Peoples R China
[2] Univ Macau, Dept Comp & Informat Sci, SKL IOTSC, Macau, Peoples R China
Keywords
Missing Data; Missing Not At Random; Imputation; Deep Generative Models; Variational Autoencoder
DOI
10.1145/3583780.3614835
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data analysis often suffers from the Missing Not At Random (MNAR) problem, in which the cause of the missingness is not fully observed. Compared with the simpler Missing Completely At Random (MCAR) setting, MNAR better reflects realistic scenarios but is more complex and challenging. Existing statistical methods model the MNAR mechanism through different decompositions of the joint distribution of the complete data and the missing mask. However, we find empirically that directly incorporating these statistical methods into deep generative models is sub-optimal. Specifically, doing so neglects the confidence of the reconstructed mask during MNAR imputation, which leads to insufficient information extraction and weaker guarantees on imputation quality. In this paper, we revisit the MNAR problem from a novel perspective: the complete data and the missing mask are two modalities of the incomplete data, on an equal footing. Along this line, we put forward a joint probability decomposition method specific to generative models, the conjunction model, which represents the distributions of the two modalities in parallel and extracts sufficient information from both the complete data and the missing mask. Taking a step further, we propose a deep generative imputation model, GNR, which processes the real-world missing mechanism in the latent space and concurrently imputes the incomplete data and reconstructs the missing mask. The experimental results show that GNR surpasses state-of-the-art MNAR baselines by significant margins (average RMSE improvements ranging from 9.9% to 18.8%) and consistently achieves better mask reconstruction accuracy, which makes the imputation more principled.
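The difference between MCAR and MNAR described in the abstract can be illustrated with a minimal sketch. This is not the GNR model or any method from the paper. It assumes a single Gaussian feature, a hypothetical self-masking rule in which the largest values go missing, and plain mean imputation, with RMSE computed only on the missing entries. The function names `make_masks`, `mean_impute`, and `rmse_on_missing` are invented for illustration.

```python
import numpy as np


def make_masks(x, missing_rate=0.3, seed=0):
    """Return (mcar_mask, mnar_mask); True marks an observed entry."""
    rng = np.random.default_rng(seed)
    # MCAR: every entry is dropped with the same probability,
    # independently of its value.
    mcar = rng.random(x.shape) >= missing_rate
    # MNAR (assumed self-masking rule): an entry is missing when its own
    # value falls in the top `missing_rate` fraction, so the cause of the
    # missingness is itself unobserved.
    threshold = np.quantile(x, 1.0 - missing_rate)
    mnar = x < threshold
    return mcar, mnar


def mean_impute(x, mask):
    """Fill missing entries with the mean of the observed entries."""
    fill = x[mask].mean()
    return np.where(mask, x, fill)


def rmse_on_missing(x_true, x_imputed, mask):
    """RMSE restricted to the entries that were missing."""
    missing = ~mask
    diff = x_true[missing] - x_imputed[missing]
    return float(np.sqrt(np.mean(diff ** 2)))


if __name__ == "__main__":
    x = np.random.default_rng(42).normal(size=10_000)
    mcar, mnar = make_masks(x)
    for name, mask in (("MCAR", mcar), ("MNAR", mnar)):
        err = rmse_on_missing(x, mean_impute(x, mask), mask)
        print(f"{name}: observed mean={x[mask].mean():.3f}, RMSE={err:.3f}")
```

Under this self-masking rule the observed mean is biased downward, so an imputer that ignores the mask does noticeably worse than it does under MCAR. This is the kind of gap that motivates modeling the data and the mask jointly.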
Pages: 316-325
Page count: 10