Data Stealing Attacks against Large Language Models via Backdooring

Authors
He, Jiaming [1]
Hou, Guanyu [1]
Jia, Xinyue [1]
Chen, Yangyang [1]
Liao, Wenqi [1]
Zhou, Yinhang [2]
Zhou, Rang [1]
Affiliations
[1] Chengdu Univ Technol, Coll Comp Sci & Cyber Secur, Oxford Brookes Coll, Chengdu 610059, Peoples R China
[2] Shenyang Normal Univ, Software Coll, Shenyang 110034, Peoples R China
Keywords
data privacy; large language models; stealing attacks
DOI
10.3390/electronics13142858
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Large language models (LLMs) have gained immense attention and are increasingly applied in various domains. However, this technological leap poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring. Our method focuses on the model customization scenario and is conducted in two phases, backdoor training and backdoor activation, which allow private information to be extracted without prior knowledge of the model's architecture or training data. During the model customization stage, attackers inject the backdoor into the pre-trained LLM by poisoning a small fraction of the training dataset. During the inference stage, attackers can extract private information from the third-party knowledge database by including the pre-defined backdoor trigger in their queries. The method thus exploits the customization process of LLMs to inject a stealthy backdoor that can be triggered after deployment to retrieve private data. Extensive experiments on popular LLM architectures demonstrate the effectiveness of the attack, which achieves a notable attack success rate while remaining stealthy during normal inference.
Pages: 19