PRIVIMAGE: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining

Cited by: 0
Authors
Li, Kecen [1 ,4 ]
Gong, Chen [2 ]
Li, Zhixiang [3 ]
Zhao, Yuzhong [4 ]
Hou, Xinwen [1 ]
Wang, Tianhao [2 ]
Affiliations
[1] Chinese Academy of Sciences, Institute of Automation, Beijing, People's Republic of China
[2] University of Virginia, Charlottesville, VA, USA
[3] University of Bristol, Bristol, England
[4] University of Chinese Academy of Sciences, Beijing, People's Republic of China
DOI
Not available
Abstract
Differentially private (DP) image data synthesis leverages DP techniques to generate synthetic data that replaces sensitive data, allowing organizations to share and use synthetic images without privacy concerns. Previous methods combine advanced generative models with pre-training on a public dataset to produce high-quality DP image data, but they suffer from unstable training and massive computational resource demands. This paper proposes a novel DP image synthesis method, termed PRIVIMAGE, which carefully selects pre-training data, promoting the efficient creation of DP datasets with high fidelity and utility. PRIVIMAGE first establishes a semantic query function using a public dataset. This function is then used to query the semantic distribution of the sensitive dataset, which guides the selection of public data with similar semantics for pre-training. Finally, we pre-train an image generative model on the selected data and fine-tune it on the sensitive dataset using Differentially Private Stochastic Gradient Descent (DP-SGD). PRIVIMAGE allows us to train a lightly parameterized generative model, which reduces the noise in the gradient during DP-SGD training and enhances training stability. Extensive experiments demonstrate that PRIVIMAGE uses only 1% of the public dataset for pre-training and 7.6% of the parameters in the generative model compared to the state-of-the-art method, while achieving superior synthesis performance and conserving more computational resources. On average, PRIVIMAGE achieves 6.8% lower FID and 13.2% higher classification accuracy than the state-of-the-art method. The replication package and datasets can be accessed online.
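The data-selection step described in the abstract can be illustrated with a small sketch. The abstract does not state how the semantic distribution is queried or privatized, so everything here is an assumption made for illustration: each sensitive image is assumed to have already been mapped to one semantic label by some query function, the label histogram is assumed to be perturbed with Gaussian noise, and the public images whose labels fall in the top-k noisy categories are kept. The names `noisy_semantic_histogram`, `select_public_subset`, `sigma`, and `top_k` are hypothetical and do not come from the paper.

```python
import random
from collections import Counter


def noisy_semantic_histogram(sensitive_labels, num_classes, sigma, rng):
    """Count semantic labels of the sensitive data and add Gaussian noise.

    Assumption: each sensitive image contributes exactly one label, so
    adding or removing one image changes one count by 1.
    """
    counts = Counter(sensitive_labels)
    return [counts.get(c, 0) + rng.gauss(0.0, sigma) for c in range(num_classes)]


def select_public_subset(public_items, noisy_hist, top_k):
    """Keep public (name, label) items whose label is among the top_k noisy counts."""
    ranked = sorted(range(len(noisy_hist)), key=lambda c: noisy_hist[c], reverse=True)
    keep = set(ranked[:top_k])
    return [item for item in public_items if item[1] in keep]


# Toy data: labels assumed to come from a semantic query function
# trained on the public dataset (not implemented here).
rng = random.Random(0)
sensitive_labels = [0] * 50 + [2] * 45 + [1] * 3 + [3] * 2
public_items = [(f"img_{i}", i % 4) for i in range(20)]

hist = noisy_semantic_histogram(sensitive_labels, num_classes=4, sigma=1.0, rng=rng)
subset = select_public_subset(public_items, hist, top_k=2)
print(len(subset), "of", len(public_items), "public images selected for pre-training")
```

In this toy setting the two dominant categories are far apart from the others relative to the noise scale, so the selected subset contains only public images from those categories. Pre-training and DP-SGD fine-tuning would then operate on `subset` and the sensitive data respectively; neither is shown here.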
Pages: 4837-4854
Page count: 18