MISSING DATA IMPUTATION FOR HEALTH CARE BIG DATA USING DENOISING AUTOENCODER WITH GENERATIVE ADVERSARIAL NETWORK

Cited by: 0
Authors
Zhang, Yinbing [1 ]
Affiliations
[1] Hubu Univ, Coll Chem & Chem Engn, Wuhan 430062, Hubei, Peoples R China
Source
Keywords
Data imputation; Missing data; Autoencoders; GAN; Deep learning
DOI
10.12694/scpe.v25i5.3023
Chinese Library Classification
TP31 [Computer Software];
Subject classification codes
081202; 0835
Abstract
Missing data imputation is a key topic in healthcare, covering the problems and strategies involved in handling incomplete data in medical records, clinical trials, and health surveys. Healthcare data may be missing for a variety of reasons, including survey non-response, data entry errors, or information left unrecorded during clinical appointments. This paper introduces a novel approach to imputing missing data with a hybrid model that integrates denoising autoencoders with generative adversarial networks (GANs). We begin by highlighting the prevalence of missing data in healthcare datasets and its potential impact on analytical outcomes. The proposed method combines the denoising autoencoder's ability to reconstruct data from noisy inputs with the GAN's ability to generate synthetic data that is indistinguishable from real data. By combining these two neural network architectures, our model shows an improved ability to predict and fill in missing data points. To validate the approach, we conducted experiments on several large-scale healthcare datasets with varying degrees of artificially introduced missingness. The model was benchmarked against traditional imputation methods such as mean imputation and k-nearest neighbors, as well as against standalone denoising autoencoders and GANs. Our results indicate a significant improvement in imputation accuracy, as measured by root mean square error (RMSE) and mean absolute error (MAE), confirming that the hybrid model handles missing data robustly.
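The evaluation protocol described in the abstract (artificially introduced missingness, a baseline imputer, and RMSE/MAE scoring) can be illustrated with a minimal sketch. This is not the paper's hybrid model or its code: it covers only the mean-imputation baseline named in the abstract, it assumes entries are hidden completely at random (the abstract does not state the missingness mechanism), and the toy data and the function names `mask_mcar`, `mean_impute`, and `rmse_mae` are hypothetical.

```python
import numpy as np

def mask_mcar(X, missing_rate, rng):
    """Hide entries completely at random (an assumed mechanism).
    Returns the masked copy and a boolean mask that is True where data was hidden."""
    mask = rng.random(X.shape) < missing_rate
    X_missing = X.copy()
    X_missing[mask] = np.nan
    return X_missing, mask

def mean_impute(X_missing):
    """Baseline: replace each missing entry with its column mean over observed values."""
    col_means = np.nanmean(X_missing, axis=0)
    return np.where(np.isnan(X_missing), col_means, X_missing)

def rmse_mae(X_true, X_imputed, mask):
    """Score only the entries that were hidden, since those are the ones being imputed."""
    err = X_imputed[mask] - X_true[mask]
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(np.abs(err)))

# Hypothetical toy data standing in for a complete numeric table.
rng = np.random.default_rng(0)
X = rng.normal(loc=[120.0, 80.0, 37.0], scale=[15.0, 10.0, 0.5], size=(500, 3))

for rate in (0.1, 0.3, 0.5):  # varying degrees of missingness
    X_missing, mask = mask_mcar(X, rate, rng)
    X_hat = mean_impute(X_missing)
    rmse, mae = rmse_mae(X, X_hat, mask)
    print(f"missing rate {rate:.0%}: RMSE={rmse:.3f}, MAE={mae:.3f}")
```

Any other imputer, including a learned one, could be swapped in for `mean_impute` and scored the same way, because the metrics depend only on the ground truth, the imputed table, and the mask.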
Pages: 3850-3857
Number of pages: 8