PPEM: Privacy-preserving EM learning for mixture models

Cited by: 2
Authors
Lee, Sharon X. [1]
Leemaqz, Kaleb L. [1]
McLachlan, Geoffrey J. [1]
Affiliation
[1] Univ Queensland, Sch Math & Phys, Brisbane, Qld 4067, Australia
Source
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2019, Vol. 31, Issue 24
Keywords
corrupted party; EM algorithm; homomorphic encryption; malicious adversary; mixture model; privacy preserving data mining;
DOI
10.1002/cpe.5208
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Privacy is becoming increasingly important in collaborative data analysis, especially in analyses involving personal or sensitive information, which commonly arises in health and commercial settings. The aim of privacy-preserving statistical algorithms is to allow inference to be drawn on the joint data without disclosing the private data held by each party. This paper presents a privacy-preserving expectation-maximization (PPEM) algorithm for carrying out maximum likelihood estimation of the parameters of mixture models. We address the scenario of horizontally partitioned data distributed among three or more parties. The PPEM algorithm is a two-cycle iterative distributed algorithm for fitting mixture models under privacy-preserving requirements. A distinct advantage of PPEM is that it does not require a trusted third party for cooperative learning, unlike most existing schemes, which implement a master/slave hierarchy. By adopting a ring topology and adding random noise to messages before encryption, PPEM helps prevent information leakage in the case of corrupted parties. Furthermore, in contrast to existing work, which typically assumes an honest-but-curious adversary, we consider the much stronger case of a malicious adversary. For illustration, PPEM is applied to two of the most popular mixture models, namely, the normal mixture model (NMM) and the t-mixture model (tMM), and its effectiveness is analyzed through a security analysis. A real-data example is also presented to evaluate the computational complexity and accuracy of PPEM relative to its non-privacy-preserving version.
Pages: 21
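The abstract describes EM fitting over horizontally partitioned data, with messages passed around a ring and masked with random noise. The following is a minimal sketch of that general idea for a one-dimensional normal mixture, not the authors' PPEM protocol: it omits homomorphic encryption and any defense against malicious parties, and every function name (`local_stats`, `ring_sum`, `m_step`, `fit`) is a hypothetical choice made for illustration. Each party computes local sufficient statistics in the E-step; the first party adds a random mask before the running total travels around the ring and removes it at the end, so intermediate messages do not expose any single party's statistics directly.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a univariate normal with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def local_stats(data, pis, mus, variances):
    """E-step on one party's data.

    Returns a flat list holding, per component k,
    [sum of r_k, sum of r_k * x, sum of r_k * x^2].
    """
    g = len(pis)
    stats = [0.0] * (3 * g)
    for x in data:
        weights = [pis[k] * normal_pdf(x, mus[k], variances[k]) for k in range(g)]
        total = sum(weights)
        for k in range(g):
            r = weights[k] / total  # posterior responsibility of component k
            stats[3 * k] += r
            stats[3 * k + 1] += r * x
            stats[3 * k + 2] += r * x * x
    return stats


def ring_sum(local_vectors, rng):
    """Masked sum around a ring (illustrative assumption, no encryption).

    Party 0 adds a random mask to its vector, each later party adds its own
    vector to the running message, and party 0 finally removes the mask.
    """
    dim = len(local_vectors[0])
    mask = [rng.uniform(-1e3, 1e3) for _ in range(dim)]
    message = [m + v for m, v in zip(mask, local_vectors[0])]
    for vec in local_vectors[1:]:
        message = [a + b for a, b in zip(message, vec)]
    return [a - m for a, m in zip(message, mask)]


def m_step(totals):
    """M-step from the aggregated sufficient statistics."""
    g = len(totals) // 3
    n_total = sum(totals[3 * k] for k in range(g))  # responsibilities sum to n
    pis, mus, variances = [], [], []
    for k in range(g):
        s0, s1, s2 = totals[3 * k: 3 * k + 3]
        mu = s1 / s0
        pis.append(s0 / n_total)
        mus.append(mu)
        variances.append(s2 / s0 - mu * mu)
    return pis, mus, variances


def fit(parties, pis, mus, variances, iterations=50, seed=0):
    """Run EM where only masked aggregates are exchanged between parties."""
    rng = random.Random(seed)
    for _ in range(iterations):
        local = [local_stats(d, pis, mus, variances) for d in parties]
        totals = ring_sum(local, rng)
        pis, mus, variances = m_step(totals)
    return pis, mus, variances


# Toy data (made up for this sketch): three parties, two clusters.
parties = [
    [0.1, -0.2, 5.1, 4.8],
    [0.3, 5.3, 4.9, -0.1],
    [0.0, 5.0, 0.2, 5.2],
]
pis, mus, variances = fit(parties, [0.5, 0.5], [1.0, 4.0], [1.0, 1.0])
print(pis, mus, variances)
```

Because the mask cancels exactly in the final step, the estimates should agree, up to floating-point rounding, with running ordinary EM on the pooled data. A real deployment would need the cryptographic layer the abstract refers to; a single additive mask alone does not protect against colluding neighbors.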