Data Stealing Attacks against Large Language Models via Backdooring

Cited by: 0

Authors:

He, Jiaming [1]

Hou, Guanyu [1]

Jia, Xinyue [1]

Chen, Yangyang [1]

Liao, Wenqi [1]

Zhou, Yinhang [2]

Zhou, Rang [1]

Affiliations:

[1] Chengdu Univ Technol, Coll Comp Sci & Cyber Secur, Oxford Brookes Coll, Chengdu 610059, Peoples R China

[2] Shenyang Normal Univ, Software Coll, Shenyang 110034, Peoples R China

Source:

ELECTRONICS | 2024 / Vol. 13 / Issue 14

Keywords:

data privacy; large language models; stealing attacks

DOI:

10.3390/electronics13142858

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Subject classification code:

0812

Abstract:

Large language models (LLMs) have gained immense attention and are increasingly applied in various domains. However, this technological leap poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method that extracts private training data from pre-trained LLMs via backdooring. Our method focuses mainly on the model customization scenario and proceeds in two phases, backdoor training and backdoor activation, which allow private information to be extracted without prior knowledge of the model's architecture or training data. During the model customization stage, attackers inject the backdoor into the pre-trained LLM by poisoning a small fraction of the training dataset. During the inference stage, attackers can extract private information from the third-party knowledge database by incorporating the pre-defined backdoor trigger. Our method thus leverages the customization process of LLMs, injecting a stealthy backdoor that can be triggered after deployment to retrieve private data. Extensive experiments demonstrate the effectiveness of our stealing attack on popular LLM architectures, where it achieves a notable attack success rate, as well as its stealthiness during normal inference.

Pages: 19
