Differentiable and Scalable Generative Adversarial Models for Data Imputation

Cited by: 4
Authors
Wu, Yangyang [1]
Wang, Jun [2]
Miao, Xiaoye [1]
Wang, Wenjia [2]
Yin, Jianwei [3]
Affiliations
[1] Zhejiang Univ, Ctr Data Sci, Hangzhou 310058, Peoples R China
[2] Hong Kong Univ Sci & Technol, Kowloon, Hong Kong, Peoples R China
[3] Zhejiang Univ, Coll Comp Sci, Ctr Data Sci, Hangzhou 310058, Peoples R China
Keywords
Data imputation; generative adversarial network; large-scale incomplete data; EFFICIENT
DOI
10.1109/TKDE.2023.3293129
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data imputation has been extensively explored to solve the missing data problem. The dramatically increasing volume of incomplete data makes imputation models computationally infeasible in many real-life applications. In this paper, we propose an effective scalable imputation system named SCIS that significantly speeds up the training of differentiable generative adversarial imputation models, under accuracy guarantees, for large-scale incomplete data. SCIS consists of two modules: differentiable imputation modeling (DIM) and sample size estimation (SSE). DIM leverages a new masking Sinkhorn divergence function to make an arbitrary generative adversarial imputation model differentiable; for such a differentiable imputation model, SSE estimates an appropriate sample size that ensures the user-specified imputation accuracy of the final model. Moreover, SCIS can also accelerate autoencoder-based imputation models. Extensive experiments on several real-life large-scale datasets demonstrate that our proposed system can accelerate generative adversarial model training by 6.23x. Using around 1.27% of the samples, SCIS yields accuracy competitive with state-of-the-art imputation methods in much shorter computation time.
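The abstract names a masking Sinkhorn divergence but does not define it. As a rough illustration only, the sketch below computes a standard entropy-regularized Sinkhorn divergence between two batches after zeroing out unobserved entries with the missingness mask. The masking rule, the function names (`entropic_ot`, `masked_sinkhorn_divergence`), the squared Euclidean cost, and the parameters `eps` and `n_iter` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def entropic_ot(x, y, eps=0.1, n_iter=200):
    """Entropy-regularized OT value between uniformly weighted point clouds.

    x has shape (n, d), y has shape (m, d); the cost is squared Euclidean
    distance. Uses log-domain Sinkhorn iterations on the dual potentials.
    """
    n, m = x.shape[0], y.shape[0]
    cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iter):
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    # After the g-update the dual objective reduces to <a, f> + <b, g>.
    return np.sum(f) / n + np.sum(g) / m

def masked_sinkhorn_divergence(x_obs, x_imputed, mask, eps=0.1, n_iter=200):
    """Sinkhorn divergence restricted to observed entries (assumed masking rule).

    mask is True where a value was observed. Unobserved entries are set to
    zero in both batches, so only observed coordinates affect the cost.
    """
    xm = np.where(mask, x_obs, 0.0)      # np.where also neutralizes NaNs
    ym = np.where(mask, x_imputed, 0.0)
    return (entropic_ot(xm, ym, eps, n_iter)
            - 0.5 * entropic_ot(xm, xm, eps, n_iter)
            - 0.5 * entropic_ot(ym, ym, eps, n_iter))
```

Under this assumed masking rule, an imputation that reproduces every observed value yields a divergence of zero regardless of what it fills into the missing cells, while errors on observed entries increase it. Because every step is built from smooth operations, the same computation written in an autodiff framework would provide gradients with respect to the imputed values.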
Pages: 490-503
Number of pages: 14