Differentiable and Scalable Generative Adversarial Models for Data Imputation

Cited by: 4

Authors:

Wu, Yangyang [1]

Wang, Jun [2]

Miao, Xiaoye [1]

Wang, Wenjia [2]

Yin, Jianwei [3]

Affiliations:

[1] Zhejiang Univ, Ctr Data Sci, Hangzhou 310058, Peoples R China

[2] Hong Kong Univ Sci & Technol, Kowloon, Hong Kong, Peoples R China

[3] Zhejiang Univ, Coll Comp Sci, Ctr Data Sci, Hangzhou 310058, Peoples R China

Source:

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | 2024 / Vol. 36 / No. 02

Keywords:

Data imputation; generative adversarial network; large-scale incomplete data; EFFICIENT

DOI:

10.1109/TKDE.2023.3293129

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Data imputation has been extensively explored to solve the missing data problem. The dramatically increasing volume of incomplete data makes imputation models computationally infeasible in many real-life applications. In this paper, we propose an effective scalable imputation system named SCIS that significantly speeds up the training of differentiable generative adversarial imputation models, with accuracy guarantees, on large-scale incomplete data. SCIS consists of two modules: differentiable imputation modeling (DIM) and sample size estimation (SSE). DIM leverages a new masking Sinkhorn divergence function to make an arbitrary generative adversarial imputation model differentiable; for such a differentiable imputation model, SSE estimates an appropriate sample size that ensures the user-specified imputation accuracy of the final model. Moreover, SCIS can also accelerate autoencoder-based imputation models. Extensive experiments on several real-life large-scale datasets demonstrate that the proposed system can accelerate generative adversarial model training by 6.23x. Using around 1.27% of the samples, SCIS yields accuracy competitive with state-of-the-art imputation methods in much shorter computation time.
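The abstract names a masking Sinkhorn divergence but does not give its formula. As a rough illustration only, the sketch below computes a generic entropy-regularized Sinkhorn divergence between two batches, with the pairwise cost restricted to coordinates observed in both rows. The masked cost, the uniform weights, the function names, and the parameters `eps` and `iters` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def logsumexp(z, axis):
    # Numerically stable log(sum(exp(z))) along one axis.
    zmax = np.max(z, axis=axis, keepdims=True)
    return np.squeeze(zmax, axis=axis) + np.log(np.sum(np.exp(z - zmax), axis=axis))

def masked_cost(x, mx, y, my):
    # Squared distance over coordinates observed in BOTH rows (mask value 1 = observed).
    # Missing entries must hold a finite placeholder (e.g. 0), not NaN.
    diff = x[:, None, :] - y[None, :, :]      # shape (n, m, d)
    both = mx[:, None, :] * my[None, :, :]    # shape (n, m, d)
    return np.sum(both * diff ** 2, axis=-1)  # shape (n, m)

def entropic_ot(cost, eps=1.0, iters=200):
    # Log-domain Sinkhorn iterations with uniform weights on both batches.
    n, m = cost.shape
    log_a, log_b = -np.log(n), -np.log(m)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        f = -eps * logsumexp(log_b + (g[None, :] - cost) / eps, axis=1)
        g = -eps * logsumexp(log_a + (f[:, None] - cost) / eps, axis=0)
    # Dual objective value; approximates the regularized transport cost.
    return f.mean() + g.mean()

def masked_sinkhorn_divergence(x, mx, y, my, eps=1.0, iters=200):
    # Debiased form: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y).
    ot_xy = entropic_ot(masked_cost(x, mx, y, my), eps, iters)
    ot_xx = entropic_ot(masked_cost(x, mx, x, mx), eps, iters)
    ot_yy = entropic_ot(masked_cost(y, my, y, my), eps, iters)
    return ot_xy - 0.5 * ot_xx - 0.5 * ot_yy

# Toy usage: compare a batch with a shifted copy of itself.
rng = np.random.default_rng(0)
mask = (rng.random((8, 3)) > 0.2).astype(float)
data = rng.normal(size=(8, 3)) * mask         # zero-fill the "missing" cells
print(masked_sinkhorn_divergence(data, mask, data + 10.0, mask))
```

Every step here is built from differentiable operations, so the same computation could be ported to an automatic-differentiation framework and used as a training loss; whether this matches the DIM module in detail cannot be determined from the abstract.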

Pages: 490-503

Page count: 14
