Differentiable and Scalable Generative Adversarial Models for Data Imputation

Cited by: 4
Authors
Wu, Yangyang [1]
Wang, Jun [2]
Miao, Xiaoye [1]
Wang, Wenjia [2]
Yin, Jianwei [3]
Affiliations
[1] Zhejiang Univ, Ctr Data Sci, Hangzhou 310058, Peoples R China
[2] Hong Kong Univ Sci & Technol, Kowloon, Hong Kong, Peoples R China
[3] Zhejiang Univ, Coll Comp Sci, Ctr Data Sci, Hangzhou 310058, Peoples R China
Keywords
Data imputation; generative adversarial network; large-scale incomplete data; EFFICIENT;
DOI
10.1109/TKDE.2023.3293129
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data imputation has been extensively explored as a solution to the missing data problem. The rapidly increasing volume of incomplete data makes imputation models computationally infeasible in many real-life applications. In this paper, we propose an effective scalable imputation system named SCIS that significantly speeds up the training of differentiable generative adversarial imputation models with accuracy guarantees for large-scale incomplete data. SCIS consists of two modules: differentiable imputation modeling (DIM) and sample size estimation (SSE). DIM leverages a new masking Sinkhorn divergence function to make an arbitrary generative adversarial imputation model differentiable. For such a differentiable imputation model, SSE estimates an appropriate sample size that ensures the user-specified imputation accuracy of the final model. Moreover, SCIS can also accelerate autoencoder-based imputation models. Extensive experiments on several real-life large-scale datasets demonstrate that our proposed system can accelerate generative adversarial model training by 6.23x. Using around 1.27% of the samples, SCIS yields accuracy competitive with state-of-the-art imputation methods in much shorter computation time.
Pages: 490 - 503
Number of pages: 14
Related papers
50 items in total
  • [31] Trend-Aware Data Imputation Based on Generative Adversarial Network for Time Series
    Li, Han
    Liu, Zhenxiong
    Niu, Jixiang
    Yang, Zhongguo
    Ali, Sikandar
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGIES AND SYSTEMS APPROACH, 2023, 16 (03)
  • [32] MDTGAN: Multi domain generative adversarial transfer learning network for traffic data imputation
    Fang, Jie
    He, Hangyu
    Xu, Mengyun
    Chen, Hongting
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [33] Joint Representation Learning with Generative Adversarial Imputation Network for Improved Classification of Longitudinal Data
    Pingi, Sharon Torao
    Zhang, Duoyi
    Bashar, Md Abul
    Nayak, Richi
    DATA SCIENCE AND ENGINEERING, 2024, 9 (01) : 5 - 25
  • [34] QAR Data Imputation Using Generative Adversarial Network with Self-Attention Mechanism
    Zhao, Jingqi
    Rong, Chuitian
    Dang, Xin
    Sun, Huabo
    BIG DATA MINING AND ANALYTICS, 2024, 7 (01) : 12 - 28
  • [35] Multistate time series imputation using generative adversarial network with applications to traffic data
    Haitao Li
    Qian Cao
    Qiaowen Bai
    Zhihui Li
    Hongyu Hu
    Neural Computing and Applications, 2023, 35 : 6545 - 6567
  • [36] Federated conditional generative adversarial nets imputation method for air quality missing data
    Zhou, Xu
    Liu, Xiaofeng
    Lan, Gongjin
    Wu, Jian
    KNOWLEDGE-BASED SYSTEMS, 2021, 228
  • [37] Improved generative adversarial network with bald eagle search optimization for missing data imputation
    Xiwen Qin
    Hongyu Shi
    Xiaogang Dong
    Siqi Zhang
    Liping Yuan
    Sijia Guo
    Earth Science Informatics, 2025, 18 (2)
  • [38] DeepMicroGen: a generative adversarial network-based method for longitudinal microbiome data imputation
    Choi, Joung Min
    Ji, Ming
    Watson, Layne T.
    Zhang, Liqing
    BIOINFORMATICS, 2023, 39 (05)
  • [39] Multivariate Time Series Imputation with Generative Adversarial Networks
    Luo, Yonghong
    Cai, Xiangrui
    Zhang, Ying
    Xu, Jun
    Yuan, Xiaojie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [40] CAGAIN: Column Attention Generative Adversarial Imputation Networks
    Kawagoshi, Jun
    Dong, Yuyang
    Nozawa, Takuma
    Xiao, Chuan
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2023, PT II, 2023, 14147 : 258 - 273