Data Stealing Attacks against Large Language Models via Backdooring

被引:0
|
作者
He, Jiaming [1 ]
Hou, Guanyu [1 ]
Jia, Xinyue [1 ]
Chen, Yangyang [1 ]
Liao, Wenqi [1 ]
Zhou, Yinhang [2 ]
Zhou, Rang [1 ]
机构
[1] Chengdu Univ Technol, Coll Comp Sci & Cyber Secur, Oxford Brookes Coll, Chengdu 610059, Peoples R China
[2] Shenyang Normal Univ, Software Coll, Shenyang 110034, Peoples R China
关键词
data privacy; large language models; stealing attacks;
D O I
10.3390/electronics13142858
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Large language models (LLMs) have gained immense attention and are being increasingly applied in various domains. However, this technological leap forward poses serious security and privacy concerns. This paper explores a novel approach to data stealing attacks by introducing an adaptive method to extract private training data from pre-trained LLMs via backdooring. Our method mainly focuses on the scenario of model customization and is conducted in two phases, including backdoor training and backdoor activation, which allow for the extraction of private information without prior knowledge of the model's architecture or training data. During the model customization stage, attackers inject the backdoor into the pre-trained LLM by poisoning a small ratio of the training dataset. During the inference stage, attackers can extract private information from the third-party knowledge database by incorporating the pre-defined backdoor trigger. Our method leverages the customization process of LLMs, injecting a stealthy backdoor that can be triggered after deployment to retrieve private data. We demonstrate the effectiveness of our proposed attack through extensive experiments, achieving a notable attack success rate. Extensive experiments demonstrate the effectiveness of our stealing attack in popular LLM architectures, as well as stealthiness during normal inference.
引用
收藏
页数:19
相关论文
共 50 条
  • [41] How Large Language Models Will Disrupt Data Management
    Fernandez, Raul Castro
    Elmore, Aaron J.
    Franklin, Michael J.
    Krishnan, Sanjay
    Tan, Chenhao
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (11): : 3302 - 3309
  • [42] Data Augmentation for Spoken Language Understanding via Pretrained Language Models
    Peng, Baolin
    Zhu, Chenguang
    Zeng, Michael
    Gao, Jianfeng
    INTERSPEECH 2021, 2021, : 1219 - 1223
  • [43] Potential of Large Language Models as Tools Against Medical Disinformation
    Zhu, Lingxuan
    Mou, Weiming
    Luo, Peng
    JAMA INTERNAL MEDICINE, 2024, 184 (04)
  • [44] Stealing Machine Learning Models via Prediction APIs
    Tramer, Florian
    Zhang, Fan
    Juels, Ari
    Reiter, Michael K.
    Ristenpart, Thomas
    PROCEEDINGS OF THE 25TH USENIX SECURITY SYMPOSIUM, 2016, : 601 - 618
  • [45] Data Selection for Language Models via Importance Resampling
    Xie, Sang Michael
    Santurkar, Shibani
    Ma, Tengyu
    Liang, Percy
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [46] A Plot is Worth a ThousandWords: Model Information Stealing Attacks via Scientific Plots
    Zhang, Boyang
    He, Xinlei
    Shen, Yun
    Wang, Tianhao
    Zhang, Yang
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 5289 - 5306
  • [47] Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models
    Deng, Yinlin
    Xia, Chunqiu Steven
    Peng, Haoran
    Yang, Chenyuan
    Zhan, Lingming
    PROCEEDINGS OF THE 32ND ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2023, 2023, : 423 - 435
  • [48] Membership Inference Attacks Against Deep Learning Models via Logits Distribution
    Yan, Hongyang
    Li, Shuhao
    Wang, Yajie
    Zhang, Yaoyuan
    Sharif, Kashif
    Hu, Haibo
    Li, Yuanzhang
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2023, 20 (05) : 3799 - 3808
  • [49] Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks
    Hossain, Md Zarif
    Imteaj, Ahmed
    Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024, 2024, : 6250 - 6259
  • [50] Membership Inference Attacks Against Machine Learning Models via Prediction Sensitivity
    Liu, Lan
    Wang, Yi
    Liu, Gaoyang
    Peng, Kai
    Wang, Chen
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2023, 20 (03) : 2341 - 2347