Vulnerability Analysis of Continuous Prompts for Pre-trained Language Models

Cited by: 0
Authors
Li, Zhicheng [1 ]
Shi, Yundi [1 ]
Sheng, Xuan [1 ]
Yin, Changchun [1 ]
Zhou, Lu [1 ]
Li, Piji [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Jiangsu, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Prompt-based Learning; Adversarial Attack; Pretrained Language Models;
DOI
10.1007/978-3-031-44201-8_41
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prompt-based learning has recently emerged as a promising approach for handling increasingly complex downstream natural language processing (NLP) tasks, achieving state-of-the-art performance without relying on models with hundreds of billions of parameters. This paper investigates the general vulnerability of continuous prompt-based learning in NLP tasks and uncovers an important problem: the predictions of continuous prompt-based models can be easily misled by noise perturbations. To investigate this issue, we propose a learnable attack approach that generates noise perturbations whose L2-norm is minimized, so that the original, benign continuous prompts are attacked in a way practitioners may not notice. Our approach introduces a new loss function that produces small yet impactful perturbations for each continuous prompt. Moreover, we show that learnable attack perturbations with an L2-norm close to zero can severely degrade the performance of continuous prompt-based models on downstream tasks. We evaluate our learnable attack against two continuous prompt-based models on three benchmark datasets, and the results demonstrate that both the noise-based and learnable attack methods can effectively attack continuous prompts, with some tasks exhibiting an F1-score close to 0.
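To make the described attack concrete, the following is a minimal, self-contained sketch in Python/PyTorch of the general idea: a perturbation added to frozen continuous prompt embeddings is learned to mislead the model while keeping its own L2-norm small. The toy classifier ToyPromptModel, the trade-off coefficient norm_weight, and the specific loss formulation are illustrative assumptions and do not reproduce the paper's actual loss function or models; in practice the frozen model would be a prompt-tuned pre-trained language model and the perturbation would be optimized over a downstream dataset.

# Illustrative sketch of a learnable small-norm perturbation attack on continuous prompts.
# All names and the loss weighting are assumptions standing in for the paper's setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a frozen prompt-tuned model: continuous prompt embeddings are
# prepended to input embeddings, and the mean-pooled sequence is classified.
class ToyPromptModel(nn.Module):
    def __init__(self, dim=32, num_labels=2, prompt_len=8):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.1)
        self.classifier = nn.Linear(dim, num_labels)

    def forward(self, input_embeds, prompt_delta=None):
        prompt = self.prompt if prompt_delta is None else self.prompt + prompt_delta
        seq = torch.cat(
            [prompt.unsqueeze(0).expand(input_embeds.size(0), -1, -1), input_embeds],
            dim=1,
        )
        return self.classifier(seq.mean(dim=1))

model = ToyPromptModel()
for p in model.parameters():
    p.requires_grad_(False)  # prompts and weights stay frozen; only delta is learned

inputs = torch.randn(16, 10, 32)      # toy batch of input embeddings
labels = torch.randint(0, 2, (16,))   # toy ground-truth labels

# Learnable perturbation on the continuous prompt, optimized to (i) mislead the
# model (maximize task loss) and (ii) keep its own L2-norm small.
delta = torch.zeros_like(model.prompt, requires_grad=True)
optimizer = torch.optim.Adam([delta], lr=1e-2)
norm_weight = 1.0  # assumed trade-off coefficient between the two terms

for step in range(200):
    logits = model(inputs, prompt_delta=delta)
    task_loss = nn.functional.cross_entropy(logits, labels)
    loss = -task_loss + norm_weight * delta.norm(p=2)  # maximize error, minimize norm
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

with torch.no_grad():
    acc = (model(inputs, prompt_delta=delta).argmax(-1) == labels).float().mean()
    print(f"||delta||_2 = {delta.norm().item():.4f}, accuracy under attack = {acc.item():.2f}")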
Pages: 508-519
Number of pages: 12