Reviewing Autoencoders for Missing Data Imputation: Technical Trends, Applications and Outcomes

Citations: 0
Authors
Pereira, Ricardo Cardoso [1]
Santos, Miriam Seoane [1, 2]
Rodrigues, Pedro Pereira [3]
Abreu, Pedro Henriques [1]
Affiliations
[1] Univ Coimbra, Dept Informat Engn, Ctr Informat & Syst Univ Coimbra CISUC, P-3030790 Coimbra, Portugal
[2] IPO Porto Res Ctr, P-4200072 Porto, Portugal
[3] Univ Porto, Fac Med MEDCIDS FMUP, Ctr Hlth Technol & Serv Res CINTESIS, P-4200319 Porto, Portugal
Keywords
SURVIVAL PREDICTION; NEURAL-NETWORKS
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Missing data is common in real-world datasets and can degrade the performance of most machine learning models. Several deep learning techniques have been used to address this issue, among them the Autoencoder and its Denoising and Variational variants. These models can learn a representation of data that contains missing values and generate plausible values to replace the missing ones. This study surveys the use of Autoencoders for the imputation of tabular data and considers 26 works published between 2014 and 2020. The analysis focuses on patterns and recommendations for the architecture, hyperparameters and training settings of the network, and provides a detailed discussion of the results obtained by Autoencoders in comparison with other state-of-the-art methods, and of the data contexts in which they have been applied. The conclusions include a set of recommendations for the technical settings of the network, and show that Denoising Autoencoders outperform their competitors, particularly the commonly used statistical methods.
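As an illustration of the general idea described in the abstract, the following is a minimal sketch of denoising-autoencoder imputation for tabular data. It is not the method of any surveyed work: the function name `dae_impute`, the single tanh hidden layer, and all hyperparameter defaults are assumptions chosen for brevity, and the sketch assumes every column has at least one observed value.

```python
import numpy as np

def dae_impute(X, hidden=8, epochs=500, lr=0.1, corrupt_p=0.2, seed=0):
    """Fill NaNs in a 2-D array with a tiny denoising autoencoder (illustrative only)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)                      # True where a value is present

    # Standardise with statistics of observed entries; pre-fill gaps with 0,
    # which is the column mean after standardisation.
    mu = np.nanmean(X, axis=0)
    sd = np.nanstd(X, axis=0)
    sd[sd == 0] = 1.0
    Z = np.where(observed, (X - mu) / sd, 0.0)

    n, d = Z.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)
    n_obs = observed.sum()

    for _ in range(epochs):
        # Denoising: randomly zero some inputs so the network has to
        # reconstruct them from the remaining features.
        keep = rng.random(Z.shape) > corrupt_p
        Z_in = Z * keep

        H = np.tanh(Z_in @ W1 + b1)              # encoder
        out = H @ W2 + b2                        # decoder

        # Mean squared error computed on observed entries only.
        err = (out - Z) * observed
        g_out = 2.0 * err / n_obs
        gW2 = H.T @ g_out
        gb2 = g_out.sum(axis=0)
        gH = (g_out @ W2.T) * (1.0 - H ** 2)     # backprop through tanh
        gW1 = Z_in.T @ gH
        gb1 = gH.sum(axis=0)

        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2

    # Reconstruct from the uncorrupted, pre-filled input and undo the scaling.
    recon = (np.tanh(Z @ W1 + b1) @ W2 + b2) * sd + mu
    # Keep observed values untouched; use reconstructions only for the gaps.
    return np.where(observed, X, recon)

# Usage on synthetic, correlated columns with some entries removed.
rng = np.random.default_rng(1)
a = rng.normal(size=(50, 1))
data = np.hstack([a, 2 * a + 1, -a]) + 0.01 * rng.normal(size=(50, 3))
data[::7, 1] = np.nan
imputed = dae_impute(data)
```

Restricting the loss to observed entries keeps the placeholder zeros from being treated as ground truth, and the input corruption is what distinguishes this from a plain autoencoder.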
Pages: 1255-1285
Page count: 31
Related papers
50 in total
  • [31] Multiple imputation for missing data
    Patrician, PA
    [J]. RESEARCH IN NURSING & HEALTH, 2002, 25 (01) : 76 - 84
  • [32] Imputation of missing data in surveys
    Rässler, S
    [J]. JAHRBUCHER FUR NATIONALOKONOMIE UND STATISTIK, 2000, 220 (01) : 64 - 94
  • [33] Multiple imputation of missing data
    Lydersen, Stian
    [J]. TIDSSKRIFT FOR DEN NORSKE LAEGEFORENING, 2022, 142 (02) : 151 - 151
  • [34] Missing Data Imputation With Bayesian Maximum Entropy for Internet of Things Applications
    Gonzalez-Vidal, Aurora
    Rathore, Punit
    Rao, Aravinda S.
    Mendoza-Bernal, Jose
    Palaniswami, Marimuthu
    Skarmeta-Gomez, Antonio F.
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (21) : 16108 - 16120
  • [35] Missing data imputation, matching and other applications of random recursive partitioning
    Iacus, Stefano A.
    Porro, Giuseppe
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2007, 52 (02) : 773 - 789
  • [36] Qualitative Imputation of Missing Potential Outcomes
    Coppock, Alexander
    Kaur, Dipin
    [J]. AMERICAN JOURNAL OF POLITICAL SCIENCE, 2022, 66 (03) : 681 - 695
  • [37] From Missing Data Imputation to Data Generation
    Neves, Diogo Telmo
    Alves, Joao
    Naik, Marcel Ganesh
    Proenca, Alberto Jose
    Prasser, Fabian
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2022, 61
  • [38] Data variability in the imputation quality of missing data
    Stochero, Elisandra Lucia Moro
    Lucio, Alessandro Dal'Col
    Jacobi, Luciane Flores
    [J]. ACTA SCIENTIARUM-AGRONOMY, 2024, 46
  • [39] Influence of Data Distribution in Missing Data Imputation
    Santos, Miriam Seoane
    Soares, Jastin Pompeu
    Abreu, Pedro Henriques
    Araujo, Helder
    Santos, Joao
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, AIME 2017, 2017, 10259 : 285 - 294
  • [40] Variable selection with missing data in both covariates and outcomes: Imputation and machine learning
    Hu, Liangyuan
    Lin, Jung-Yi Joyce
    Ji, Jiayi
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2021, 30 (12) : 2651 - 2671