Establishing strong imputation performance of a denoising autoencoder in a wide range of missing data problems

Cited by: 41
Authors
Abiri, Najmeh [1]
Linse, Bjorn [1]
Eden, Patrik [1]
Ohlsson, Mattias [1, 2]
Affiliations
[1] Lund Univ, Dept Astron & Theoret Phys, Lund, Sweden
[2] Halmstad Univ, Ctr Appl Intelligent Syst Res, Halmstad, Sweden
Keywords
Deep learning; Autoencoder; Imputation; Missing data; NETWORK;
DOI
10.1016/j.neucom.2019.07.065
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dealing with missing data in data analysis is inevitable. Although powerful imputation methods that address this problem exist, there is still much room for improvement. In this study, we examined single imputation based on deep autoencoders, motivated by the apparent success of deep learning in efficiently extracting useful dataset features. We developed a consistent framework for both training and imputation, and benchmarked the results against state-of-the-art imputation methods on datasets of different sizes and characteristics. The work was not limited to datasets with a single variable type; we also imputed missing data with multi-type variables, e.g., a combination of binary, categorical, and continuous attributes. To evaluate the imputation methods, we randomly corrupted the complete data, with varying degrees of corruption, and then compared the imputed values with the original ones. In all experiments, the developed autoencoder obtained the smallest error across all levels of initial data corruption. (C) 2019 Elsevier B.V. All rights reserved.
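The evaluation protocol described in the abstract (corrupt complete data at random, impute, compare against the original values) can be sketched roughly as follows. This is only an illustration under stated assumptions: the column-mean imputer is a simple stand-in and not the paper's autoencoder, RMSE is an assumed error measure, and the function names, corruption rates, and synthetic Gaussian data are hypothetical.

```python
import math
import random


def corrupt(data, rate, rng):
    """Return a copy of `data` with each entry removed (None) with probability `rate`, plus the mask."""
    corrupted, mask = [], []
    for row in data:
        row_mask = [rng.random() < rate for _ in row]
        corrupted.append([None if m else v for v, m in zip(row, row_mask)])
        mask.append(row_mask)
    return corrupted, mask


def mean_impute(corrupted):
    """Stand-in imputer (NOT the paper's method): fill gaps with the column's observed mean."""
    n_cols = len(corrupted[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in corrupted if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in corrupted]


def rmse_on_missing(original, imputed, mask):
    """Assumed metric: RMSE over the entries that were removed, ignoring untouched ones."""
    errors = [
        (o - i) ** 2
        for orow, irow, mrow in zip(original, imputed, mask)
        for o, i, m in zip(orow, irow, mrow)
        if m
    ]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


rng = random.Random(0)
# Synthetic complete dataset: 200 rows, 4 continuous columns (hypothetical).
complete = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]

results = {}
for rate in (0.1, 0.3, 0.5):  # varying degrees of corruption
    corrupted, mask = corrupt(complete, rate, rng)
    imputed = mean_impute(corrupted)
    results[rate] = rmse_on_missing(complete, imputed, mask)
    print(f"corruption={rate:.1f}  RMSE={results[rate]:.3f}")
```

Any imputer with the same input and output shape could replace `mean_impute` in this loop, which is how different methods would be compared at each corruption level.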
Pages: 137-146
Number of pages: 10
Related papers
50 in total
  • [1] Multivariate Time Series Missing Data Imputation Using Recurrent Denoising Autoencoder
    Zhang, Jianye
    Yin, Peng
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 760 - 764
  • [2] Missing data imputation using an iterative denoising autoencoder (IDAE) for dissolved gas analysis
    Seo, Boseong
    Shin, Jaekyung
    Kim, Taejin
    Youn, Byeng D.
    [J]. ELECTRIC POWER SYSTEMS RESEARCH, 2022, 212
  • [3] MISSING DATA IMPUTATION FOR HEALTH CARE BIG DATA USING DENOISING AUTOENCODER WITH GENERATIVE ADVERSARIAL NETWORK
    Zhang, Yinbing
    [J]. SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2024, 25 (05): : 3850 - 3857
  • [4] Denoising Autoencoder-Based Missing Value Imputation for Smart Meters
    Ryu, Seunghyoung
    Kim, Minsoo
    Kim, Hongseok
    [J]. IEEE ACCESS, 2020, 8 : 40656 - 40666
  • [5] Denoising Masked Autoencoder-Based Missing Imputation within Constrained Environments for Electric Load Data
    Jeong, Jaeik
    Ku, Tai-Yeon
    Park, Wan-Ki
    [J]. ENERGIES, 2023, 16 (24)
  • [6] Siamese Autoencoder Architecture for the Imputation of Data Missing Not at Random
    Pereira, Ricardo Cardoso
    Abreu, Pedro Henriques
    Rodrigues, Pedro Pereira
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2024, 78
  • [7] Masked Autoencoder Transformer for Missing Data Imputation of PISA
    Freire, Guilherme Mendonca
    Curi, Mariana
    [J]. ARTIFICIAL INTELLIGENCE IN EDUCATION: POSTERS AND LATE BREAKING RESULTS, WORKSHOPS AND TUTORIALS, INDUSTRY AND INNOVATION TRACKS, PRACTITIONERS, DOCTORAL CONSORTIUM AND BLUE SKY, AIED 2024, PT I, 2024, 2150 : 364 - 372
  • [9] MIDIA: exploring denoising autoencoders for missing data imputation
    Ma, Qian
    Lee, Wang-Chien
    Fu, Tao-Yang
    Gu, Yu
    Yu, Ge
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 34 (06) : 1859 - 1897
  • [10] Imputation of Missing Values in Training Data using Variational Autoencoder
    Hong, Xuerui
    Hao, Shuang
    [J]. 2023 IEEE 39TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS, ICDEW, 2023, : 49 - 54