Selective privacy-preserving framework for large language models fine-tuning

Cited by: 0
Authors
Wang, Teng [1 ]
Zhai, Lindong [1 ]
Yang, Tengfei [1 ]
Luo, Zhucheng [2 ]
Liu, Shuanggen [1 ]
Institutions
[1] Xian Univ Posts & Telecommun, Sch Cyberspace Secur, Xian 710121, Shaanxi, Peoples R China
[2] Sun Yat Sen Univ, Affiliated Hosp 3, Informat Ctr, Guangzhou 510630, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Large language models; Fine-tuning; Local differential privacy; Selective privacy protection; DIFFERENTIAL PRIVACY;
DOI
10.1016/j.ins.2024.121000
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Fine-tuning pre-trained large language models (LLMs) benefits various downstream tasks, but it risks serious privacy leakage because training relies on large amounts of data. Differentially private stochastic gradient descent (DPSGD) introduces noise during model updates to prevent such leakage. Nevertheless, fine-tuning LLMs via DPSGD limits model utility, since heavy perturbations are applied to large, high-dimensional gradients. Moreover, existing privacy-preserving mechanisms perturb all tokens of the input sentences, which is too pessimistic to achieve good model performance. This paper therefore studies a selective privacy-preserving framework for fine-tuning LLMs. We propose a first-of-its-kind privacy notion called selective sequence local differential privacy (S-SeqLDP), which provides indistinguishability guarantees only for the secret part of each sequence. Furthermore, we design a novel framework called SLDP-FT that enables S-SeqLDP-compliant LLM fine-tuning by perturbing the forward-pass embeddings with selective noise. We innovatively investigate the privacy forward weight, which determines the noise magnitude needed to achieve selective privacy protection. Extensive experiments on three tasks demonstrate that SLDP-FT achieves better model accuracy than state-of-the-art techniques at the same level of privacy protection.
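
To make selective perturbation concrete, the sketch below adds local-differential-privacy noise only to the embeddings of tokens marked as secret and leaves the remaining tokens untouched. It is a minimal illustration under stated assumptions, not the paper's actual SLDP-FT mechanism: the Laplace mechanism, the L1 clipping bound, and all names and parameters (selective_perturb, epsilon, clip_norm) are hypothetical choices for this example.

import numpy as np

# Hypothetical stand-in for the selective noising that SLDP-FT applies to
# forward-pass embeddings: only rows flagged as secret are perturbed.
def selective_perturb(embeddings, secret_mask, epsilon, clip_norm=1.0):
    out = embeddings.copy()
    for i in np.flatnonzero(secret_mask):
        row = out[i]
        # Clip to an L1 bound so the mechanism's sensitivity is 2 * clip_norm.
        norm = np.abs(row).sum()
        if norm > clip_norm:
            row = row * (clip_norm / norm)
        # Laplace mechanism: noise scale = sensitivity / epsilon.
        out[i] = row + np.random.laplace(0.0, 2.0 * clip_norm / epsilon, size=row.shape)
    return out

# Usage: a five-token sentence whose third and fourth tokens are secret.
emb = np.random.randn(5, 8)
mask = np.array([False, False, True, True, False])
private_emb = selective_perturb(emb, mask, epsilon=4.0)

Because non-secret tokens keep their exact embeddings, a selective notion such as S-SeqLDP can retain more utility than mechanisms that perturb every token.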
Pages: 14
Related Papers
50 records in total
  • [41] Fine-tuning language models to recognize semantic relations
    Roussinov, Dmitri
    Sharoff, Serge
    Puchnina, Nadezhda
    LANGUAGE RESOURCES AND EVALUATION, 2023, 57 : 1463 - 1486
  • [42] Fine-Tuning Language Models with Just Forward Passes
    Malladi, Sadhika
    Gao, Tianyu
    Nichani, Eshaan
    Damian, Alex
    Lee, Jason D.
    Chen, Danqi
    Arora, Sanjeev
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [43] PPTIF: Privacy-Preserving Transformer Inference Framework for Language Translation
    Liu, Yanxin
    Su, Qianqian
    IEEE ACCESS, 2024, 12 : 48881 - 48897
  • [44] A Comparative Analysis of Instruction Fine-Tuning Large Language Models for Financial Text Classification
    Fatemi, Sorouralsadat
    Hu, Yuheng
    Mousavi, Maryam
    ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS, 2025, 16 (01)
  • [45] LLM-MANUF: An integrated framework of Fine-Tuning large language models for intelligent Decision-Making in manufacturing
    Du, Kaze
    Yang, Bo
    Xie, Keqiang
    Dong, Nan
    Zhang, Zhengping
    Wang, Shilong
    Mo, Fan
    ADVANCED ENGINEERING INFORMATICS, 2025, 65
  • [46] How fine can fine-tuning be? Learning efficient language models
    Radiya-Dixit, Evani
    Wang, Xin
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 2435 - 2442
  • [47] Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models
    Trad, Fouad
    Chehab, Ali
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (01): : 367 - 384
  • [48] Efficient Fine-Tuning Large Language Models for Knowledge-Aware Response Planning
    Nguyen, Minh
    Kishan, K. C.
    Nguyen, Toan
    Chadha, Ankit
    Vu, Thuy
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT II, 2023, 14170 : 593 - 611
  • [49] Leveraging error-assisted fine-tuning large language models for manufacturing excellence
    Xia, Liqiao
    Li, Chengxi
    Zhang, Canbin
    Liu, Shimin
    Zheng, Pai
    ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2024, 88
  • [50] Fine-Tuning Large Language Models for Radiation Oncology, A Specialized Health Care Domain
    Wang, P.
    Liu, Z.
    Li, Y.
    Holmes, J.
    Shu, P.
    Zhang, L.
    Li, X.
    Li, Q.
    Vora, S. A.
    Patel, S. H.
    Sio, T. T. W.
    Liu, T.
    Liu, W.
    INTERNATIONAL JOURNAL OF RADIATION ONCOLOGY BIOLOGY PHYSICS, 2024, 120 (02): : E664 - E664