PRIVIMAGE: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining

Cited by: 0
Authors
Li, Kecen [1, 4]
Gong, Chen [2]
Li, Zhixiang [3]
Zhao, Yuzhong [4]
Hou, Xinwen [1]
Wang, Tianhao [2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Virginia, Charlottesville, VA USA
[3] Univ Bristol, Bristol, Avon, England
[4] Univ Chinese Acad Sci, Beijing, Peoples R China
DOI: not available
Abstract
Differential Privacy (DP) image data synthesis leverages DP techniques to generate synthetic data that can replace sensitive data, allowing organizations to share and use synthetic images without privacy concerns. Previous methods combine advanced generative models with pre-training on a public dataset to produce high-quality DP image data, but suffer from unstable training and massive computational resource demands. This paper proposes a novel DP image synthesis method, termed PRIVIMAGE, which carefully selects pre-training data, promoting the efficient creation of DP datasets with high fidelity and utility. PRIVIMAGE first establishes a semantic query function using a public dataset. This function is then used to query the semantic distribution of the sensitive dataset, which guides the selection of public data with similar semantics for pre-training. Finally, we pre-train an image generative model on the selected data and fine-tune it on the sensitive dataset using Differentially Private Stochastic Gradient Descent (DP-SGD). PRIVIMAGE allows us to train a lightly parameterized generative model, reducing the noise in the gradient during DP-SGD training and enhancing training stability. Extensive experiments demonstrate that PRIVIMAGE uses only 1% of the public dataset for pre-training and 7.6% of the parameters in the generative model compared to the state-of-the-art method, while achieving superior synthetic performance and conserving more computational resources. On average, PRIVIMAGE achieves 6.8% lower FID and 13.2% higher Classification Accuracy than the state-of-the-art method. The replication package and datasets can be accessed online (1).
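The pipeline described in the abstract (query the semantic distribution of the sensitive data, select matching public data, then fine-tune with DP-SGD) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the semantic query function has already assigned one integer category to each image, that the distribution is released as a histogram with Gaussian noise, and that public images are kept when their category is among the top-k noisy counts. All function names and parameters (`query_semantic_distribution`, `select_public_subset`, `dp_sgd_step`, `sigma`, `top_k`, `clip_norm`, `noise_multiplier`) are hypothetical.

```python
import numpy as np

def query_semantic_distribution(sensitive_labels, num_classes, sigma, rng):
    """Noisy histogram of semantic categories (Gaussian noise is an assumption)."""
    counts = np.bincount(sensitive_labels, minlength=num_classes).astype(float)
    return counts + rng.normal(0.0, sigma, size=num_classes)

def select_public_subset(public_labels, noisy_counts, top_k):
    """Indices of public images whose category is among the top_k noisy counts."""
    top = np.argsort(noisy_counts)[::-1][:top_k]
    return np.flatnonzero(np.isin(public_labels, top)), top

def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update: clip each per-example gradient, sum, add noise, average."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_example_grads)
    return params - lr * noisy_mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical categories produced by a semantic query function.
    sensitive_labels = np.array([2, 2, 2, 2, 5, 5, 5, 1])
    public_labels = np.array([0, 1, 2, 2, 3, 5, 5, 4])
    noisy = query_semantic_distribution(sensitive_labels, 6, sigma=0.5, rng=rng)
    selected, top = select_public_subset(public_labels, noisy, top_k=2)
    print("categories kept:", top, "public indices kept:", selected)
    # Toy fine-tuning step on made-up per-example gradients.
    params = np.zeros(2)
    grads = rng.normal(size=(8, 2))
    print(dp_sgd_step(params, grads, clip_norm=1.0, noise_multiplier=1.0, lr=0.1, rng=rng))
```

In `dp_sgd_step` the noise standard deviation is `noise_multiplier * clip_norm` per parameter coordinate, which is the standard DP-SGD form. The noisy histogram query also consumes privacy budget, so in a real system its cost would need to be accounted for together with the DP-SGD fine-tuning.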
Pages: 4837-4854
Page count: 18
Related papers
38 in total
  • [1] Local and Global GANs With Semantic-Aware Upsampling for Image Generation
    Tang, Hao
    Shao, Ling
    Torr, Philip H. S.
    Sebe, Nicu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) : 768 - 784
  • [2] NASDM: Nuclei-Aware Semantic Histopathology Image Generation Using Diffusion Models
    Shrivastava, Aman
    Fletcher, P. Thomas
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VI, 2023, 14225 : 786 - 796
  • [3] Next Generation Assisting Clinical Applications by using Semantic-aware Electronic Health Records
    De Potter, Pieterjan
    Debevere, Pedro
    Mannens, Erik
    Van de Walle, Rik
    2009 22ND IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, 2009, : 19 - 23
  • [4] Semantic-Aware Generator and Low-level Feature Augmentation for Few-shot Image Generation
    Wang, Zhe
    Guan, Jiaoyan
    Yang, Mengping
    Xiao, Ting
    Chi, Ziqiu
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 5079 - 5088
  • [5] SAW: Semantic-Aware WebRTC Transmission Using Diffusion-Based Scalable Video Coding
    Wen, Yihan
    Zhang, Zheng
    Sun, Jiayi
    Li, Jinglei
    Chen, Chung Shue
    Niu, Guanchong
    IEEE INTERNET OF THINGS JOURNAL, 2025, 12 (05) : 5346 - 5359
  • [6] SAW: Semantic-Aware WebRTC Transmission Using Diffusion-Based Scalable Video Coding
    Wen, Yihan
    Zhang, Zheng
    Sun, Jiayi
    Li, Jinglei
    Chen, Chung Shue
    Niu, Guanchong
    IEEE Internet of Things Journal, 2024,
  • [7] Differentially private synthetic medical data generation using convolutional GANs
    Torfi, Amirsina
    Fox, Edward A.
    Reddy, Chandan K.
    INFORMATION SCIENCES, 2022, 586 : 485 - 500
  • [8] Synthetic Water Crystal Image Generation Using VAE-GANs and Diffusion Models
    Aymen, Farah
    Pester, Andreas
    Andres, Frederic
    SMART MOBILE COMMUNICATION & ARTIFICIAL INTELLIGENCE, VOL 1, IMCL 2023, 2024, 936 : 95 - 104
  • [9] Echo from Noise: Synthetic Ultrasound Image Generation Using Diffusion Models for Real Image Segmentation
    Stojanovski, David
    Hermida, Uxio
    Lamata, Pablo
    Beqiri, Arian
    Gomez, Alberto
    SIMPLIFYING MEDICAL ULTRASOUND, ASMUS 2023, 2023, 14337 : 34 - 43
  • [10] 3D-aware Image Generation using 2D Diffusion Models
    Xiang, Jianfeng
    Yang, Jiaolong
    Huang, Binbin
    Tong, Xin
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 2383 - 2393