P3GM: Private High-Dimensional Data Release via Privacy Preserving Phased Generative Model

Cited by: 15
Authors
Takagi, Shun [1, 2]
Takahashi, Tsubasa [2]
Cao, Yang [1]
Yoshikawa, Masatoshi [1]
Affiliations
[1] Kyoto Univ, Kyoto, Japan
[2] LINE Corp, Toronto, ON, Canada
Keywords
differential privacy; variational autoencoder; generative model; privacy preserving data synthesis; MIXTURE;
DOI
10.1109/ICDE51399.2021.00022
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
How can we release a massive volume of sensitive data while mitigating privacy risks? Privacy-preserving data synthesis enables the data holder to outsource analytical tasks to an untrusted third party. The state-of-the-art approach to this problem is to build a generative model under differential privacy, which offers a rigorous privacy guarantee. However, existing methods cannot adequately handle high-dimensional data. In particular, when the input dataset contains a large number of features, existing techniques require injecting a prohibitive amount of noise to satisfy differential privacy, which renders the outsourced data analysis meaningless. To address this issue, this paper proposes the privacy-preserving phased generative model (P3GM), a differentially private generative model for releasing such sensitive data. P3GM employs a two-phase learning process to make it robust against the noise and to increase learning efficiency (e.g., easier convergence). We give theoretical analyses of the learning complexity and privacy loss of P3GM. We further evaluate the proposed method experimentally and demonstrate that P3GM significantly outperforms existing solutions. Compared with state-of-the-art methods, our generated samples appear less noisy and closer to the original data in terms of data diversity. Moreover, in several data mining tasks with synthesized data, our model outperforms the competitors in terms of accuracy.
Pages: 169 - 180
Number of pages: 12
Related papers
32 in total
  • [1] Preserving Privacy of Continuous High-dimensional Data with Minimax Filters
    Hamm, Jihun
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 38, 2015, 38 : 324 - 332
  • [2] A divide-and-conquer approach to privacy-preserving high-dimensional big data release
    Wang, Rong
    Liang, Junchuan
    Wang, Siyu
    Chang, Chin-Chen
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2024, 83
  • [3] DPPro: Differentially Private High-Dimensional Data Release via Random Projection
    Xu, Chugui
    Ren, Ju
    Zhang, Yaoxue
    Qin, Zhan
    Ren, Kui
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2017, 12 (12) : 3081 - 3093
  • [4] Privacy-preserving high-dimensional data publishing for classification
    Wang, Rong
    Zhu, Yan
    Chang, Chin-Chen
    Peng, Qiang
    COMPUTERS & SECURITY, 2020, 93
  • [5] Fusion: Privacy-preserving Distributed Protocol for High-Dimensional Data Mashup
    Dagher, Gaby G.
    Iqbal, Farkhund
    Arafati, Mahtab
    Fung, Benjamin C. M.
    2015 IEEE 21ST INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2015, : 760 - 769
  • [6] Preserving Privacy of High-Dimensional Data by l-Diverse Constrained Slicing
    Amin, Zenab
    Anjum, Adeel
    Khan, Abid
    Ahmad, Awais
    Jeon, Gwanggil
    ELECTRONICS, 2022, 11 (08)
  • [7] Differentially Private High-Dimensional Data Publication via Markov Network
    Wei, Fengqiong
    Zhang, Wei
    Chen, Yunfang
    Zhao, Jingwen
    SECURITY AND PRIVACY IN COMMUNICATION NETWORKS, SECURECOMM 2018, PT I, 2018, 254 : 133 - 148
  • [8] Locally Private High-Dimensional Crowdsourced Data Release Based on Copula Functions
    Wang, Teng
    Yang, Xinyu
    Ren, Xuebin
    Yu, Wei
    Yang, Shusen
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (02) : 778 - 792
  • [9] Multimodal Data Fusion in High-Dimensional Heterogeneous Datasets Via Generative Models
    Yilmaz, Yasin
    Aktukmak, Mehmet
    Hero, Alfred
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 : 5175 - 5188
  • [10] A Privacy Preserving Similarity Search Scheme over Encrypted High-Dimensional Data for Multiple Data Owners
    Guo, Cheng
    Tian, Pengxu
    Jie, Yingmo
    Tang, Xinyu
    CLOUD COMPUTING AND SECURITY, PT II, 2018, 11064 : 484 - 495