MIVAE: Multiple Imputation based on Variational Auto-Encoder

Cited by: 6
Authors
Ma, Qian [1]
Li, Xia [1]
Bai, Mei [1]
Wang, Xite [1]
Ning, Bo [1]
Li, Guanyu [1]
Affiliations
[1] Dalian Maritime Univ, Sch Informat Sci & Technol, Dalian 116026, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Missing value; Multiple imputation; Variational Auto-Encoder; Data quality; MISSING DATA; INFERENCE;
DOI
10.1016/j.engappai.2023.106270
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Missing value (MV) imputation has become a research hotspot in the field of data quality, since MVs are prevalent in real-world datasets and pose challenges to advanced data analytics algorithms. To impute MVs, most existing approaches directly derive one estimate for each MV, which is categorized as single imputation (SI). However, SI ignores the uncertainty of the MVs and therefore usually yields unsatisfactory imputation results compared to multiple imputation (MI). To capture the uncertainty of the MVs, MI algorithms derive multiple candidate estimates for each MV. Nevertheless, existing MI approaches are few because of the complicated data-handling process. Accordingly, in this paper, by exploring the Variational Auto-Encoder (VAE) model, we propose a new MI approach, namely MIVAE (Multiple Imputation based on Variational Auto-Encoder), to impute MVs for tabular data. In MIVAE, we first add a corrupted input layer (where synthetic MVs are introduced) adjacent to the original input layer so that the model can handle MVs. Then, we obtain multiple rather than single candidate estimates for each data sample from the posterior distribution of the latent variables learned by our designed model. In this way, multiple imputation is effectively implemented and the uncertainty of the MVs is captured. Next, to obtain satisfactory imputation results, we add a data analysis layer at the end of the network to integrate the multiple candidate estimates. Finally, experimental results over four real-world datasets demonstrate that MIVAE achieves significantly higher imputation accuracy than existing solutions, and that MIVAE is capable of handling both numerical and categorical tabular data. For example, on the CropMapping dataset, the imputation accuracy of MIVAE improves by up to about 40% and 30% compared with PMM and MIWAE (state-of-the-art MI approaches), respectively. Moreover, we train a MIVAE model on each of three datasets containing MVs. With the trained MIVAE, the classification performance over the imputed data is similar to that over the complete data.
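The three steps named in the abstract (corrupting the input with synthetic MVs, drawing several candidates from the latent posterior, and integrating them) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear encoder and decoder weights are untrained placeholders, the function names are hypothetical, and averaging the candidates is an assumed stand-in for the paper's data analysis layer, whose design the abstract does not specify.

```python
import numpy as np

def corrupt(x, observed_mask, rate, rng):
    """Hide a random fraction of the observed entries (synthetic MVs)."""
    drop = rng.random(x.shape) < rate
    keep = observed_mask & ~drop
    return np.where(keep, x, 0.0), keep

def sample_candidates(mu, log_var, decode, m, rng):
    """Draw m latent samples z = mu + sigma * eps and decode each one."""
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal((m,) + mu.shape)
    z = mu + sigma * eps                      # shape: (m, n, latent)
    return np.stack([decode(z_k) for z_k in z])

def pool_and_fill(x, observed_mask, candidates):
    """Average the candidates (assumed pooling rule) and fill only the MVs."""
    pooled = candidates.mean(axis=0)
    return np.where(observed_mask, x, pooled)

rng = np.random.default_rng(0)
n, d, latent = 6, 4, 2
x = rng.standard_normal((n, d))
observed = rng.random((n, d)) > 0.3           # True where a value is present
x_in = np.where(observed, x, 0.0)             # MVs zero-filled for the encoder

# Placeholder, untrained linear encoder/decoder weights (assumption).
W_mu = rng.standard_normal((d, latent)) * 0.1
W_lv = rng.standard_normal((d, latent)) * 0.1
W_dec = rng.standard_normal((latent, d)) * 0.1

# Training-time input: extra synthetic MVs on top of the real ones.
x_corrupt, kept = corrupt(x_in, observed, 0.2, rng)

# Imputation: posterior parameters from the incomplete input, then m draws.
mu = x_in @ W_mu
log_var = x_in @ W_lv
candidates = sample_candidates(mu, log_var, lambda z: z @ W_dec, m=5, rng=rng)
imputed = pool_and_fill(x_in, observed, candidates)
```

In a real model the weights would be learned by reconstructing the original entries from `x_corrupt`, and the spread across `candidates` would reflect the uncertainty of each MV.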
Pages: 14