Missing Data Imputation via Denoising Autoencoders: The Untold Story

Cited by: 26
Authors
Costa, Adriana Fonseca [1]
Santos, Miriam Seoane [1]
Soares, Jastin Pompeu [1]
Abreu, Pedro Henriques [1]
Affiliations
[1] Univ Coimbra, Dept Informat Engn, CISUC, Coimbra, Portugal
Keywords
Missing data; Missing mechanisms; Data imputation; Denoising autoencoders; Survival prediction; Incomplete data
DOI
10.1007/978-3-030-01768-2_8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Missing data is the absence of information in a dataset and, since it directly influences classification performance, neglecting it is not a valid option. Over the years, several studies have presented alternative imputation strategies to deal with the three missing data mechanisms: Missing Completely At Random, Missing At Random and Missing Not At Random. However, there are no studies regarding the influence of all three mechanisms on the latest high-performance Artificial Intelligence techniques, such as Deep Learning. The goal of this work is to perform a comparative study between state-of-the-art imputation techniques and a Stacked Denoising Autoencoders approach. To that end, the missing data mechanisms were synthetically generated in 6 different ways, 8 different imputation techniques were implemented, and 33 complete datasets from different open source repositories were selected. The results showed that Support Vector Machines imputation ensures the best classification performance, while Multiple Imputation by Chained Equations performs better in terms of imputation quality.
Pages: 87-98
Number of pages: 12
Related papers
50 in total
  • [1] MIDIA: exploring denoising autoencoders for missing data imputation
    Qian Ma
    Wang-Chien Lee
    Tao-Yang Fu
    Yu Gu
    Ge Yu
    [J]. Data Mining and Knowledge Discovery, 2020, 34 : 1859 - 1897
  • [2] MIDIA: exploring denoising autoencoders for missing data imputation
    Ma, Qian
    Lee, Wang-Chien
    Fu, Tao-Yang
    Gu, Yu
    Yu, Ge
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 34 (06) : 1859 - 1897
  • [3] Missing value imputation in food composition data with denoising autoencoders
    Gjorshoska, Ivana
    Eftimov, Tome
    Trajanov, Dimitar
    [J]. JOURNAL OF FOOD COMPOSITION AND ANALYSIS, 2022, 112
  • [4] Imputation of Missing Traffic Flow Data Using Denoising Autoencoders
    Jiang, Boyuan
    Siddiqi, Muhammad Danial
    Asadi, Reza
    Regan, Amelia
    [J]. 12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 84 - 91
  • [5] Missing-Data Imputation With Position-Encoding Denoising Autoencoders for Industrial Processes
    Ou, Chen
    Zhu, Hongqiu
    Shardt, Yuri A. W.
    Ye, Lingjian
    Yuan, Xiaofeng
    Wang, Yalin
    Yang, Chunhua
    Gui, Weihua
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73
  • [6] Hyperparameter Tuning to Optimize Implementations of Denoising Autoencoders for Imputation of Missing Spatio-temporal Data
    Siddiqi, Muhammad Danial
    Jiang, Boyuan
    Asadi, Reza
    Regan, Amelia
    [J]. 12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 107 - 114
  • [7] Non-linear missing data imputation for healthcare data via index-aware autoencoders
    Sadaf Kabir
    Leily Farrokhvar
    [J]. Health Care Management Science, 2022, 25 : 484 - 497
  • [8] Non-linear missing data imputation for healthcare data via index-aware autoencoders
    Kabir, Sadaf
    Farrokhvar, Leily
    [J]. HEALTH CARE MANAGEMENT SCIENCE, 2022, 25 (03) : 484 - 497
  • [9] Physiological Waveform Imputation of Missing Data using Convolutional Autoencoders
    Miller, Daniel
    Ward, Andrew
    Bambos, Nicholas
    Scheinker, David
    Shin, Andrew
    [J]. 2018 IEEE 20TH INTERNATIONAL CONFERENCE ON E-HEALTH NETWORKING, APPLICATIONS AND SERVICES (HEALTHCOM), 2018
  • [10] Variational Autoencoders for Missing Data Imputation with Application to a Simulated Milling Circuit
    McCoy, John T.
    Kroon, Steve
    Auret, Lidia
    [J]. IFAC PAPERSONLINE, 2018, 51 (21) : 141 - 146