PRIVIMAGE: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining

Authors
Li, Kecen [1, 4]
Gong, Chen [2 ]
Li, Zhixiang [3 ]
Zhao, Yuzhong [4 ]
Hou, Xinwen [1 ]
Wang, Tianhao [2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Virginia, Charlottesville, VA USA
[3] Univ Bristol, Bristol, Avon, England
[4] Univ Chinese Acad Sci, Beijing, Peoples R China
Abstract
Differential Privacy (DP) image data synthesis leverages DP techniques to generate synthetic data that replaces sensitive data, allowing organizations to share and use synthetic images without privacy concerns. Previous methods combine advanced generative models with pre-training on a public dataset to produce high-quality DP image data, but suffer from unstable training and massive computational resource demands. This paper proposes a novel DP image synthesis method, termed PRIVIMAGE, which carefully selects pre-training data to enable the efficient creation of DP datasets with high fidelity and utility. PRIVIMAGE first establishes a semantic query function using a public dataset. This function is then used to query the semantic distribution of the sensitive dataset, which guides the selection of public data with similar semantics for pre-training. Finally, we pre-train an image generative model on the selected data and fine-tune it on the sensitive dataset using Differentially Private Stochastic Gradient Descent (DP-SGD). PRIVIMAGE allows us to train a lightly parameterized generative model, reducing the noise in the gradient during DP-SGD training and enhancing training stability. Extensive experiments demonstrate that PRIVIMAGE uses only 1% of the public dataset for pre-training and 7.6% of the parameters in the generative model compared to the state-of-the-art method, while achieving superior synthesis performance and consuming fewer computational resources. On average, PRIVIMAGE achieves 6.8% lower FID and 13.2% higher Classification Accuracy than the state-of-the-art method. The replication package and datasets can be accessed online.
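The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the semantic query function is a classifier whose predicted labels are already available, that the semantic distribution is privatized with Gaussian noise (the abstract does not spell out the mechanism), and that selection keeps the public samples belonging to the top-k noisy categories. All function and parameter names (`noisy_semantic_histogram`, `select_public_subset`, `dp_sgd_gradient`, `sigma`, `top_k`, `clip_norm`, `noise_multiplier`) are hypothetical.

```python
import numpy as np


def noisy_semantic_histogram(sensitive_labels, num_classes, sigma, rng):
    """Count predicted semantic labels of the sensitive data and add Gaussian noise.

    Assumption: the labels come from a semantic query function (e.g. a
    classifier trained on the public dataset); the noise mechanism is assumed.
    """
    counts = np.bincount(sensitive_labels, minlength=num_classes).astype(float)
    return counts + rng.normal(0.0, sigma, size=num_classes)


def select_public_subset(public_labels, noisy_hist, top_k):
    """Return indices of public samples whose label is among the top_k noisy counts."""
    top_classes = np.argsort(noisy_hist)[::-1][:top_k]
    mask = np.isin(public_labels, top_classes)
    return np.flatnonzero(mask), top_classes


def dp_sgd_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """One DP-SGD aggregation step: clip per-sample gradients, sum, add noise, average.

    per_sample_grads has shape (batch_size, num_params).
    """
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Scale each gradient so that its L2 norm is at most clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale
    # The noise dimension equals the number of parameters, so a smaller
    # model receives less total noise per step.
    noise = rng.normal(
        0.0, noise_multiplier * clip_norm, size=per_sample_grads.shape[1]
    )
    return (clipped.sum(axis=0) + noise) / per_sample_grads.shape[0]
```

In this sketch the pre-training subset would be `public_data[idx]` for the `idx` returned by `select_public_subset`, and `dp_sgd_gradient` would replace the ordinary averaged gradient during fine-tuning on the sensitive dataset. Because the noise vector has one entry per parameter, its total magnitude grows with model size, which is consistent with the abstract's point that a lightly parameterized model reduces gradient noise.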
Pages: 4837-4854
Page count: 18