Vulnerability Analysis of Continuous Prompts for Pre-trained Language Models

Cited: 0
Authors
Li, Zhicheng [1 ]
Shi, Yundi [1 ]
Sheng, Xuan [1 ]
Yin, Changchun [1 ]
Zhou, Lu [1 ]
Li, Piji [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Jiangsu, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Prompt-based Learning; Adversarial Attack; Pretrained Language Models;
DOI
10.1007/978-3-031-44201-8_41
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Prompt-based learning has recently emerged as a promising approach to handling increasingly complex downstream natural language processing (NLP) tasks, achieving state-of-the-art performance without using hundreds of billions of parameters. This paper investigates the general vulnerability of continuous prompt-based learning in NLP tasks and uncovers an important problem: the predictions of continuous prompt-based models can easily be misled by noise perturbations. To expose this issue, we propose a learnable attack approach that generates noise perturbations while minimizing their L2-norm, attacking the original, benign continuous prompts in a way that practitioners may not notice. Our approach introduces a new loss function that produces small yet impactful perturbations for each continuous prompt. Moreover, we show that learnable attack perturbations with an L2-norm close to zero can severely degrade the performance of continuous prompt-based models on downstream tasks. We evaluate our learnable attack against two continuous prompt-based models on three benchmark datasets; the results demonstrate that both the noise and the learnable attack methods can effectively attack continuous prompts, driving the F1-score close to 0 on some tasks.
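The attack objective described in the abstract — increasing the task loss induced by a perturbation delta added to the continuous prompt embedding while penalizing its L2-norm — can be sketched on a toy model. The logistic "model", the dimensions, and the penalty weight `lam` below are illustrative assumptions for a minimal sketch, not the paper's actual architecture or loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a frozen model: a logistic classifier reading the sum of
# a fixed input embedding and a tuned continuous prompt embedding.
dim = 8
W = rng.normal(size=dim)       # frozen model weights
x = rng.normal(size=dim)       # input embedding
prompt = rng.normal(size=dim)  # tuned continuous prompt
y = 1.0                        # gold label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def task_loss(delta):
    """Binary cross-entropy of the prompted model under perturbation delta."""
    p = sigmoid(W @ (x + prompt + delta))
    return -(y * np.log(p + 1e-12) + (1.0 - y) * np.log(1.0 - p + 1e-12))

# Attack objective: maximize  task_loss(delta) - lam * ||delta||_2^2,
# i.e. mislead the model while keeping the perturbation norm small.
lam, lr = 0.1, 0.5
delta = np.zeros(dim)
for _ in range(200):
    p = sigmoid(W @ (x + prompt + delta))
    grad_task = (p - y) * W                         # gradient of BCE w.r.t. delta
    delta += lr * (grad_task - 2.0 * lam * delta)   # gradient-ascent step

print(task_loss(np.zeros(dim)), task_loss(delta), np.linalg.norm(delta))
```

The penalty weight `lam` trades off perturbation size against damage; on real continuous prompts the paper reports that even near-zero-norm perturbations suffice, which this toy setup does not attempt to reproduce.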
Pages: 508-519 (12 pages)
Related Papers
50 records in total
  • [41] Pre-trained language models for keyphrase prediction: A review
    Umair, Muhammad
    Sultana, Tangina
    Lee, Young-Koo
    [J]. ICT EXPRESS, 2024, 10 (04) : 871 - 890
  • [42] Pre-trained models for natural language processing: A survey
    QIU XiPeng
    SUN TianXiang
    XU YiGe
    SHAO YunFan
    DAI Ning
    HUANG XuanJing
    [J]. Science China (Technological Sciences), 2020, (10) : 1872 - 1897
  • [43] Evaluating and Inducing Personality in Pre-trained Language Models
    Jiang, Guangyuan
    Xu, Manjie
    Zhu, Song-Chun
    Han, Wenjuan
    Zhang, Chi
    Zhu, Yixin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [44] Evaluating the Summarization Comprehension of Pre-Trained Language Models
    D. I. Chernyshev
    B. V. Dobrov
    [J]. Lobachevskii Journal of Mathematics, 2023, 44 : 3028 - 3039
  • [45] Pre-trained models for natural language processing: A survey
    XiPeng Qiu
    TianXiang Sun
    YiGe Xu
    YunFan Shao
    Ning Dai
    XuanJing Huang
    [J]. Science China Technological Sciences, 2020, 63 : 1872 - 1897
  • [46] Robust Lottery Tickets for Pre-trained Language Models
    Zheng, Rui
    Bao, Rong
    Zhou, Yuhao
    Liang, Di
    Wang, Sirui
    Wu, Wei
    Gui, Tao
    Zhang, Qi
    Huang, Xuanjing
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022, : 2211 - 2224
  • [47] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [48] Leveraging pre-trained language models for code generation
    Soliman, Ahmed
    Shaheen, Samir
    Hadhoud, Mayada
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (03) : 3955 - 3980
  • [49] Modeling Second Language Acquisition with pre-trained neural language models
    Palenzuela, Alvaro J. Jimenez
    Frasincar, Flavius
    Trusca, Maria Mihaela
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 207
  • [50] μBERT: Mutation Testing using Pre-Trained Language Models
    Degiovanni, Renzo
    Papadakis, Mike
    [J]. 2022 IEEE 15TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS (ICSTW 2022), 2022, : 160 - 169