Privacy-Preserving Machine Learning on Apache Spark

Cited by: 1

Authors:

Brito, Claudia V. [1,2]

Ferreira, Pedro G. [1,3]

Portela, Bernardo L. [1,3]

Oliveira, Rui C. [1,2]

Paulo, Joao T. [1,2]

Affiliations:

[1] INESC TEC, P-4200465 Porto, Portugal

[2] Univ Minho, Dept Informat, P-4710057 Braga, Portugal

[3] Univ Porto, Fac Sci, P-4099002 Porto, Portugal

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Cluster computing; Training; Machine learning; Hardware; Task analysis; Homomorphic encryption; Distributed computing; Trusted computing; Privacy-preserving; machine learning; distributed systems; Apache Spark; trusted execution environments; Intel SGX; SECURITY; ATTACKS

DOI:

10.1109/ACCESS.2023.3332222

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

The adoption of third-party machine learning (ML) cloud services is highly dependent on the security guarantees and the performance penalty they incur on workloads for model training and inference. This paper explores security/performance trade-offs for the distributed Apache Spark framework and its ML library. Concretely, we build upon a key insight: in specific deployment settings, one can reveal carefully chosen non-sensitive operations (e.g. statistical calculations). This allows us to considerably improve the performance of privacy-preserving solutions without exposing the protocol to pervasive ML attacks. In more detail, we propose Soteria, a system for distributed privacy-preserving ML that leverages Trusted Execution Environments (e.g. Intel SGX) to run computations over sensitive information in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41% when compared to previous related work. Our protocol is accompanied by a security proof and a discussion regarding resilience against a wide spectrum of ML attacks.

Pages: 127907-127930

Page count: 24

