Reviewing Autoencoders for Missing Data Imputation: Technical Trends, Applications and Outcomes

Cited by: 0
Authors
Pereira, Ricardo Cardoso [1]
Santos, Miriam Seoane [1, 2]
Rodrigues, Pedro Pereira [3]
Abreu, Pedro Henriques [1]
Affiliations
[1] Univ Coimbra, Dept Informat Engn, Ctr Informat & Syst Univ Coimbra CISUC, P-3030790 Coimbra, Portugal
[2] IPO Porto Res Ctr, P-4200072 Porto, Portugal
[3] Univ Porto, Fac Med MEDCIDS FMUP, Ctr Hlth Technol & Serv Res CINTESIS, P-4200319 Porto, Portugal
Keywords
SURVIVAL PREDICTION; NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Missing data is a problem often found in real-world datasets and it can degrade the performance of most machine learning models. Several deep learning techniques have been used to address this issue, and one of them is the Autoencoder and its Denoising and Variational variants. These models are able to learn a representation of the data with missing values and generate plausible new ones to replace them. This study surveys the use of Autoencoders for the imputation of tabular data and considers 26 works published between 2014 and 2020. The analysis is mainly focused on discussing patterns and recommendations for the architecture, hyperparameters and training settings of the network, while providing a detailed discussion of the results obtained by Autoencoders when compared to other state-of-the-art methods, and of the data contexts where they have been applied. The conclusions include a set of recommendations for the technical settings of the network, and show that Denoising Autoencoders outperform their competitors, particularly the often used statistical methods.
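The idea summarized in the abstract, learning a representation from incomplete data and using the reconstruction to fill the gaps, can be sketched as follows. This is a minimal illustration only, not the method of any surveyed work: the function name `dae_impute`, the single tanh hidden layer, the mean pre-fill, and all hyperparameter values are assumptions chosen for brevity, and it assumes every column has at least one observed value.

```python
import numpy as np

def dae_impute(X, hidden=8, epochs=300, lr=0.05, drop=0.2, seed=0):
    """Fill NaNs in X using a tiny denoising autoencoder (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)                     # True where a value is present

    # Standardize with statistics of the observed entries only, then
    # pre-fill missing cells with 0, i.e. the column mean after scaling.
    mu = np.nanmean(X, axis=0)
    sd = np.nanstd(X, axis=0) + 1e-8
    Z = np.where(observed, (X - mu) / sd, 0.0)

    n, d = Z.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)
    n_obs = observed.sum()

    for _ in range(epochs):
        # Denoising step: randomly zero out a fraction of the inputs.
        keep = rng.random(Z.shape) > drop
        Z_in = Z * keep

        # Forward pass: encoder (tanh) then linear decoder.
        H = np.tanh(Z_in @ W1 + b1)
        out = H @ W2 + b2

        # Squared error is computed on originally observed cells only,
        # so the pre-filled placeholders never act as training targets.
        g_out = 2.0 * (out - Z) * observed / n_obs

        # Backward pass (plain full-batch gradient descent).
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_pre = (g_out @ W2.T) * (1.0 - H ** 2)
        g_W1 = Z_in.T @ g_pre
        g_b1 = g_pre.sum(axis=0)

        W1 -= lr * g_W1; b1 -= lr * g_b1
        W2 -= lr * g_W2; b2 -= lr * g_b2

    # Reconstruct from the uncorrupted input and undo the scaling.
    recon = (np.tanh(Z @ W1 + b1) @ W2 + b2) * sd + mu

    # Keep observed values; use the reconstruction only where data was missing.
    return np.where(observed, X, recon)


# Toy usage: correlated columns with roughly 20% of cells removed at random.
demo_rng = np.random.default_rng(42)
base = demo_rng.normal(size=(200, 1))
X_full = np.hstack([base, 2 * base + 0.1 * demo_rng.normal(size=(200, 1)),
                    -base + 0.1 * demo_rng.normal(size=(200, 1))])
X_missing = X_full.copy()
X_missing[demo_rng.random(X_full.shape) < 0.2] = np.nan
X_filled = dae_impute(X_missing)
```

Restricting the loss to observed cells is one common design choice for this setting; the pre-fill strategy, the corruption rate and the network size are exactly the kind of technical settings the survey reports recommendations on.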
Pages: 1255-1285
Page count: 31
Related papers
50 in total
  • [1] Reviewing autoencoders for missing data imputation: Technical trends, applications and outcomes
    Pereira, Ricardo Cardoso
    Santos, Miriam Seoane
    Rodrigues, Pedro Pereira
    Abreu, Pedro Henriques
    [J]. Journal of Artificial Intelligence Research, 2020, 69 : 1255 - 1285
  • [2] MIDIA: exploring denoising autoencoders for missing data imputation
    Qian Ma
    Wang-Chien Lee
    Tao-Yang Fu
    Yu Gu
    Ge Yu
    [J]. Data Mining and Knowledge Discovery, 2020, 34 : 1859 - 1897
  • [3] MIDIA: exploring denoising autoencoders for missing data imputation
    Ma, Qian
    Lee, Wang-Chien
    Fu, Tao-Yang
    Gu, Yu
    Yu, Ge
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 34 (06) : 1859 - 1897
  • [4] Missing value imputation in food composition data with denoising autoencoders
    Gjorshoska, Ivana
    Eftimov, Tome
    Trajanov, Dimitar
    [J]. JOURNAL OF FOOD COMPOSITION AND ANALYSIS, 2022, 112
  • [5] Physiological Waveform Imputation of Missing Data using Convolutional Autoencoders
    Miller, Daniel
    Ward, Andrew
    Bambos, Nicholas
    Scheinker, David
    Shin, Andrew
    [J]. 2018 IEEE 20TH INTERNATIONAL CONFERENCE ON E-HEALTH NETWORKING, APPLICATIONS AND SERVICES (HEALTHCOM), 2018
  • [6] Missing Data Imputation via Denoising Autoencoders: The Untold Story
    Costa, Adriana Fonseca
    Santos, Miriam Seoane
    Soares, Jastin Pompeu
    Abreu, Pedro Henriques
    [J]. ADVANCES IN INTELLIGENT DATA ANALYSIS XVII, IDA 2018, 2018, 11191 : 87 - 98
  • [7] Imputation of Missing Traffic Flow Data Using Denoising Autoencoders
    Jiang, Boyuan
    Siddiqi, Muhammad Danial
    Asadi, Reza
    Regan, Amelia
    [J]. 12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 84 - 91
  • [8] Variational Autoencoders for Missing Data Imputation with Application to a Simulated Milling Circuit
    McCoy, John T.
    Kroon, Steve
    Auret, Lidia
    [J]. IFAC PAPERSONLINE, 2018, 51 (21) : 141 - 146
  • [9] MISSING DATA IMPUTATION IN THE ELECTRONIC HEALTH RECORD USING DEEPLY LEARNED AUTOENCODERS
    Beaulieu-Jones, Brett K.
    Moore, Jason H.
    [J]. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2017, 2017 : 207 - 218
  • [10] Multiple Imputation of Missing Composite Outcomes in Longitudinal Data
    O’Keeffe A.G.
    Farewell D.M.
    Tom B.D.M.
    Farewell V.T.
    [J]. Statistics in Biosciences, 2016, 8 (2) : 310 - 332