Vulnerability Analysis of Continuous Prompts for Pre-trained Language Models

Cited: 0
Authors
Li, Zhicheng [1 ]
Shi, Yundi [1 ]
Sheng, Xuan [1 ]
Yin, Changchun [1 ]
Zhou, Lu [1 ]
Li, Piji [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Jiangsu, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Prompt-based Learning; Adversarial Attack; Pretrained Language Models;
DOI
10.1007/978-3-031-44201-8_41
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Prompt-based learning has recently emerged as a promising approach to handling increasingly complex downstream natural language processing (NLP) tasks, achieving state-of-the-art performance without using hundreds of billions of parameters. This paper investigates the general vulnerability of continuous prompt-based learning in NLP tasks and uncovers an important problem: the predictions of continuous prompt-based models can easily be misled by noise perturbations. To expose this issue, we propose a learnable attack approach that generates noise perturbations while minimizing their L2-norm, attacking the original, benign continuous prompts in a way that practitioners may not notice. Our approach introduces a new loss function that produces small yet impactful perturbations for each continuous prompt. Moreover, we show that learnable attack perturbations with an L2-norm close to zero can severely degrade the performance of continuous prompt-based models on downstream tasks. We evaluate our learnable attack against two continuous prompt-based models on three benchmark datasets; the results demonstrate that both the noise and the learnable attack methods can effectively attack continuous prompts, driving the F1-score close to 0 on some tasks.
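The attack objective described in the abstract — increasing the task loss induced by a perturbation delta added to the continuous prompt embedding while penalizing its L2-norm — can be sketched on a toy model. The logistic "model", the dimensions, and the penalty weight `lam` below are illustrative assumptions for a minimal sketch, not the paper's actual architecture or loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a frozen model: a logistic classifier reading the sum of
# a fixed input embedding and a tuned continuous prompt embedding.
dim = 8
W = rng.normal(size=dim)       # frozen model weights
x = rng.normal(size=dim)       # input embedding
prompt = rng.normal(size=dim)  # tuned continuous prompt
y = 1.0                        # gold label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def task_loss(delta):
    """Binary cross-entropy of the prompted model under perturbation delta."""
    p = sigmoid(W @ (x + prompt + delta))
    return -(y * np.log(p + 1e-12) + (1.0 - y) * np.log(1.0 - p + 1e-12))

# Attack objective: maximize  task_loss(delta) - lam * ||delta||_2^2,
# i.e. mislead the model while keeping the perturbation norm small.
lam, lr = 0.1, 0.5
delta = np.zeros(dim)
for _ in range(200):
    p = sigmoid(W @ (x + prompt + delta))
    grad_task = (p - y) * W                         # gradient of BCE w.r.t. delta
    delta += lr * (grad_task - 2.0 * lam * delta)   # gradient-ascent step

print(task_loss(np.zeros(dim)), task_loss(delta), np.linalg.norm(delta))
```

The penalty weight `lam` trades off perturbation size against damage; on real continuous prompts the paper reports that even near-zero-norm perturbations suffice, which this toy setup does not attempt to reproduce.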
Pages: 508-519 (12 pages)
Related Papers
50 records in total
  • [41] Pre-trained language models for keyphrase prediction: A review
    Umair, Muhammad
    Sultana, Tangina
    Lee, Young-Koo
    [J]. ICT EXPRESS, 2024, 10 (04) : 871 - 890
  • [42] Pre-trained models for natural language processing: A survey
    QIU XiPeng
    SUN TianXiang
    XU YiGe
    SHAO YunFan
    DAI Ning
    HUANG XuanJing
    [J]. Science China (Technological Sciences), 2020, (10) : 1872 - 1897
  • [43] Evaluating and Inducing Personality in Pre-trained Language Models
    Jiang, Guangyuan
    Xu, Manjie
    Zhu, Song-Chun
    Han, Wenjuan
    Zhang, Chi
    Zhu, Yixin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [44] Evaluating the Summarization Comprehension of Pre-Trained Language Models
    D. I. Chernyshev
    B. V. Dobrov
    [J]. Lobachevskii Journal of Mathematics, 2023, 44 : 3028 - 3039
  • [45] Pre-trained models for natural language processing: A survey
    XiPeng Qiu
    TianXiang Sun
    YiGe Xu
    YunFan Shao
    Ning Dai
    XuanJing Huang
    [J]. Science China Technological Sciences, 2020, 63 : 1872 - 1897
  • [46] Robust Lottery Tickets for Pre-trained Language Models
    Zheng, Rui
    Bao, Rong
    Zhou, Yuhao
    Liang, Di
    Wang, Sirui
    Wu, Wei
    Gui, Tao
    Zhang, Qi
    Huang, Xuanjing
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022, : 2211 - 2224
  • [47] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [48] Leveraging pre-trained language models for code generation
    Soliman, Ahmed
    Shaheen, Samir
    Hadhoud, Mayada
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (03) : 3955 - 3980
  • [49] Modeling Second Language Acquisition with pre-trained neural language models
    Palenzuela, Alvaro J. Jimenez
    Frasincar, Flavius
    Trusca, Maria Mihaela
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 207
  • [50] μBERT: Mutation Testing using Pre-Trained Language Models
    Degiovanni, Renzo
    Papadakis, Mike
    [J]. 2022 IEEE 15TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS (ICSTW 2022), 2022, : 160 - 169