Differentiable and Scalable Generative Adversarial Models for Data Imputation

Cited by: 4
Authors
Wu, Yangyang [1]
Wang, Jun [2]
Miao, Xiaoye [1]
Wang, Wenjia [2]
Yin, Jianwei [3]
Affiliations
[1] Zhejiang Univ, Ctr Data Sci, Hangzhou 310058, Peoples R China
[2] Hong Kong Univ Sci & Technol, Kowloon, Hong Kong, Peoples R China
[3] Zhejiang Univ, Coll Comp Sci, Ctr Data Sci, Hangzhou 310058, Peoples R China
Keywords
Data imputation; generative adversarial network; large-scale incomplete data; EFFICIENT;
DOI
10.1109/TKDE.2023.3293129
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data imputation has been extensively explored as a solution to the missing data problem. However, the dramatically increasing volume of incomplete data makes imputation models computationally infeasible in many real-life applications. In this paper, we propose an effective scalable imputation system named SCIS that significantly speeds up the training of differentiable generative adversarial imputation models, with accuracy guarantees, on large-scale incomplete data. SCIS consists of two modules: differentiable imputation modeling (DIM) and sample size estimation (SSE). DIM leverages a new masking Sinkhorn divergence function to make an arbitrary generative adversarial imputation model differentiable; for such a differentiable imputation model, SSE estimates an appropriate sample size that ensures the user-specified imputation accuracy of the final model. Moreover, SCIS can also accelerate autoencoder-based imputation models. Extensive experiments on several real-life large-scale datasets demonstrate that our proposed system accelerates generative adversarial model training by 6.23x. Using around 1.27% of the samples, SCIS yields accuracy competitive with state-of-the-art imputation methods in much shorter computation time.
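The abstract names a masking Sinkhorn divergence but does not define it. As a rough illustration only, the sketch below computes a generic debiased Sinkhorn divergence between two small sample sets after zeroing the unobserved entries with a 0/1 mask. The masking rule, the function names (masked_cost, entropic_ot, sinkhorn_divergence), and the parameters eps and n_iter are all assumptions made for this example and are not taken from the paper.

```python
import math

def masked_cost(x, mx, y, my):
    # Squared Euclidean distance after zeroing unobserved entries (mask = 0).
    # Assumption: this masking rule is illustrative, not the paper's definition.
    return sum((a * ma - b * mb) ** 2 for a, ma, b, mb in zip(x, mx, y, my))

def entropic_ot(X, MX, Y, MY, eps=1.0, n_iter=200):
    # Entropy-regularized optimal transport value between two masked sample
    # sets with uniform weights, solved by plain Sinkhorn scaling iterations.
    # Note: very large costs relative to eps can underflow exp(); this sketch
    # is meant for small, well-scaled inputs only.
    n, m = len(X), len(Y)
    a = [1.0 / n] * n
    b = [1.0 / m] * m
    C = [[masked_cost(X[i], MX[i], Y[j], MY[j]) for j in range(m)] for i in range(n)]
    K = [[math.exp(-C[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Primal value: <P, C> + eps * KL(P || a x b), with P = diag(u) K diag(v).
    value = 0.0
    for i in range(n):
        for j in range(m):
            p = u[i] * K[i][j] * v[j]
            if p > 0.0:
                value += p * (C[i][j] + eps * math.log(p / (a[i] * b[j])))
    return value

def sinkhorn_divergence(X, MX, Y, MY, eps=1.0, n_iter=200):
    # Debiased form: OT(X, Y) - 0.5 * OT(X, X) - 0.5 * OT(Y, Y).
    xy = entropic_ot(X, MX, Y, MY, eps, n_iter)
    xx = entropic_ot(X, MX, X, MX, eps, n_iter)
    yy = entropic_ot(Y, MY, Y, MY, eps, n_iter)
    return xy - 0.5 * xx - 0.5 * yy

if __name__ == "__main__":
    full = [[1, 1], [1, 1]]
    X = [[0.0, 0.0], [1.0, 0.0]]
    Y = [[3.0, 3.0], [4.0, 3.0]]
    print(sinkhorn_divergence(X, full, Y, full))   # clearly positive
    # The two sets differ only in the second feature, which is masked out.
    first_only = [[1, 0], [1, 0]]
    A = [[1.0, 5.0], [2.0, 7.0]]
    B = [[1.0, 9.0], [2.0, -3.0]]
    print(sinkhorn_divergence(A, first_only, B, first_only))
```

Because every step is built from smooth operations, a quantity of this kind can in principle be differentiated with respect to the imputed values, which is the property the abstract attributes to DIM. A practical implementation would use a tensor library and log-domain updates for numerical stability.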
Pages: 490-503
Number of pages: 14
Related papers
50 in total
  • [41] Spatiotemporal Generative Adversarial Imputation Networks: An Approach to Address Missing Data for Wind Turbines
    Hu, Xuguang
    Zhan, Zhaokang
    Ma, Dazhong
    Zhang, Siqi
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [42] Traffic data imputation via knowledge graph-enhanced generative adversarial network
    Liu, Yinghui
    Shen, Guojiang
    Liu, Nali
    Han, Xiao
    Xu, Zhenhui
    Zhou, Junjie
    Kong, Xiangjie
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [43] Rear-end Crash Data Imputation Methods Using Generative Adversarial Networks
    Zhou B.
    Zhang Y.
    Zhang S.
    Zhou Q.
    Wang Q.
    Jiaotong Yunshu Xitong Gongcheng Yu Xinxi/Journal of Transportation Systems Engineering and Information Technology, 2024, 24 (01): 132 - 137 and 198
  • [44] Multistate time series imputation using generative adversarial network with applications to traffic data
    Li, Haitao
    Cao, Qian
    Bai, Qiaowen
    Li, Zhihui
    Hu, Hongyu
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (09): 6545 - 6567
  • [45] Well log data generation and imputation using sequence based generative adversarial networks
    Al-Fakih, Abdulrahman
    Koeshidayatullah, A.
    Mukerji, Tapan
    Al-Azani, Sadam
    Kaka, SanLinn I.
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [46] Joint Representation Learning with Generative Adversarial Imputation Network for Improved Classification of Longitudinal Data
    Sharon Torao Pingi
    Duoyi Zhang
    Md Abul Bashar
    Richi Nayak
    Data Science and Engineering, 2024, 9 : 5 - 25
  • [47] VIGAN: Missing View Imputation with Generative Adversarial Networks
    Shang, Chao
    Palmer, Aaron
    Sun, Jiangwen
    Chen, Ko-Shin
    Lu, Jin
    Bi, Jinbo
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017: 766 - 775
  • [48] CN-GAIN: Classification and Normalization-Denormalization-Based Generative Adversarial Imputation Network for Missing SMES Data Imputation
    Sudrajat, Antonius Wahyu
    Ermatita
    Samsuryadi
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2025, 16 (01) : 314 - 322
  • [49] Ensemble Generative Adversarial Imputation Network with Selective Multi-Generator (ESM-GAIN) for Missing Data Imputation
    Li, Yuxuan
    Dogan, Ayse
    Liu, Chenang
    2022 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2022: 807 - 812
  • [50] LFM-D2GAIN: An Improved Missing Data Imputation Method Based on Generative Adversarial Imputation Nets
    Shen, Yebai
    Zhang, Chao
    Zhang, Songyu
    Yan, Jinghua
    Bu, Fanliang
    2022 IEEE INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, BIG DATA AND ALGORITHMS (EEBDA), 2022: 447 - 453