Exploring Lottery Prompts for Pre-trained Language Models

Cited: 0
Authors
Chen, Yulin [1 ]
Ding, Ning [1 ,2 ]
Wang, Xiaobin [3 ]
Hu, Shengding [2 ]
Zheng, Hai-Tao [1 ,4 ]
Liu, Zhiyuan [2 ,5 ,6 ]
Xie, Pengjun [3 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen, Peoples R China
[2] Tsinghua Univ, DCST, Beijing, Peoples R China
[3] Alibaba Grp, Hangzhou, Peoples R China
[4] Pengcheng Lab, Shenzhen, Peoples R China
[5] Tsinghua Univ, IAI, BNRIST, Beijing, Peoples R China
[6] Jiangsu Normal Univ, Jiangsu Collaborat Innovat Ctr Language Abil, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on model adaptation, necessitating more efficient alternatives to conventional fine-tuning. Given the advantage of prompting in the zero-shot setting and the observed performance fluctuation among different prompts, we explore instance-level prompts and their generalizability. By searching through the prompt space, we first validate the assumption that, for every instance, there is almost always a lottery prompt that induces the correct prediction from the PLM, and that such a prompt can be obtained at low cost thanks to the inherent abilities of PLMs. Meanwhile, we find that some strong lottery prompts perform well over the whole training set and carry distinguishable linguistic features. Lastly, we attempt to generalize the searched strong lottery prompts to unseen data with a prompt ensembling method that requires no parameter tuning. Experiments on various types of NLP classification tasks demonstrate that the proposed method achieves results comparable to other gradient-free and optimization-free baselines.
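The abstract outlines a three-step procedure: search a prompt pool for per-instance "lottery prompts", rank prompts by training-set accuracy to find "strong" ones, and ensemble the strong prompts on unseen data without any parameter update. Below is a minimal sketch of that loop; it assumes a cloze-style binary sentiment task, a bert-base-uncased masked LM, a tiny hand-written prompt pool, and a {terrible, great} verbalizer, all of which are illustrative assumptions rather than the paper's actual search space or backbone.

```python
# Minimal sketch of lottery-prompt search plus tuning-free ensembling.
# Assumptions (not the authors' exact setup): bert-base-uncased backbone,
# binary sentiment labels, a hand-written prompt pool, and a
# {0: "terrible", 1: "great"} verbalizer.
from collections import Counter

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "bert-base-uncased"  # assumed backbone for illustration
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL).eval()

# Hypothetical prompt pool; "{x}" is the input slot, "[MASK]" the answer slot.
PROMPTS = [
    "{x} It was [MASK].",
    "{x} The movie is [MASK].",
    "{x} Overall, it felt [MASK].",
]
# Verbalizer: one label word per class (both are single BERT tokens).
LABEL_IDS = {y: tokenizer.convert_tokens_to_ids(w)
             for y, w in {0: "terrible", 1: "great"}.items()}

@torch.no_grad()
def predict(prompt: str, text: str) -> int:
    """Fill the mask slot and return the argmax class over the label words."""
    wrapped = prompt.replace("[MASK]", tokenizer.mask_token).format(x=text)
    inputs = tokenizer(wrapped, return_tensors="pt", truncation=True)
    logits = model(**inputs).logits
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()[0, 0].item()
    scores = {y: logits[0, mask_pos, tid].item() for y, tid in LABEL_IDS.items()}
    return max(scores, key=scores.get)

def lottery_prompts(text: str, label: int) -> list[str]:
    """Prompts that already induce the correct prediction for this single
    instance -- the 'lottery prompts' of the abstract."""
    return [p for p in PROMPTS if predict(p, text) == label]

def strong_prompts(train: list[tuple[str, int]], k: int = 2) -> list[str]:
    """Rank prompts by accuracy over the training set; keep the top k."""
    acc = {p: sum(predict(p, x) == y for x, y in train) / len(train)
           for p in PROMPTS}
    return sorted(acc, key=acc.get, reverse=True)[:k]

def ensemble_predict(prompts: list[str], text: str) -> int:
    """Tuning-free generalization: majority vote of the strong prompts."""
    return Counter(predict(p, text) for p in prompts).most_common(1)[0][0]

if __name__ == "__main__":
    train = [("A moving, beautifully acted film.", 1),
             ("Dull and painfully long.", 0)]
    best = strong_prompts(train)
    print(ensemble_predict(best, "An unexpected delight from start to finish."))
```

On held-out inputs, a vote over several strong prompts tends to be more stable than any single prompt, which mirrors the intuition behind the abstract's parameter-free ensembling step.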
Pages: 15428-15444 (17 pages)
Related Papers
50 records in total
  • [1] Robust Lottery Tickets for Pre-trained Language Models
    Zheng, Rui
    Bao, Rong
    Zhou, Yuhao
    Liang, Di
    Wang, Sirui
    Wu, Wei
    Gui, Tao
    Zhang, Qi
    Huang, Xuanjing
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022: 2211 - 2224
  • [2] Vulnerability Analysis of Continuous Prompts for Pre-trained Language Models
    Li, Zhicheng
    Shi, Yundi
    Sheng, Xuan
    Yin, Changchun
    Zhou, Lu
    Li, Piji
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT IX, 2023, 14262 : 508 - 519
  • [3] Lottery Jackpots Exist in Pre-Trained Models
    Zhang, Yuxin
    Lin, Mingbao
    Zhong, Yunshan
    Chao, Fei
    Ji, Rongrong
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 14990 - 15004
  • [4] Fusing Pre-trained Language Models with Multimodal Prompts through Reinforcement Learning
    Yu, Youngjae
    Chung, Jiwan
    Yun, Heeseung
    Hessel, Jack
    Park, Jae Sung
    Lu, Ximing
    Zellers, Rowan
    Ammanabrolu, Prithviraj
    Le Bras, Ronan
    Kim, Gunhee
    Choi, Yejin
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 10845 - 10856
  • [5] Exploring Pre-trained Language Models for Event Extraction and Generation
    Yang, Sen
    Feng, Dawei
    Qiao, Linbo
    Kan, Zhigang
    Li, Dongsheng
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019: 5284 - 5294
  • [6] Exploring Pre-trained Language Models for Vocabulary Alignment in the UMLS
    Hao, Xubing
    Abeysinghe, Rashmie
    Shi, Jay
    Cui, Licong
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, PT I, AIME 2024, 2024, 14844 : 273 - 278
  • [7] Pre-Trained Language Models and Their Applications
    Wang, Haifeng
    Li, Jiwei
    Wu, Hua
    Hovy, Eduard
    Sun, Yu
    [J]. ENGINEERING, 2023, 25: 51 - 65
  • [8] Pre-trained Language Model with Prompts for Temporal Knowledge Graph Completion
    Xu, Wenjie
    Liu, Ben
    Peng, Miao
    Jia, Xu
    Peng, Min
    [J]. arXiv, 2023
  • [9] Annotating Columns with Pre-trained Language Models
    Suhara, Yoshihiko
    Li, Jinfeng
    Li, Yuliang
    Zhang, Dan
    Demiralp, Cagatay
    Chen, Chen
    Tan, Wang-Chiew
    [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022: 1493 - 1503
  • [10] LaoPLM: Pre-trained Language Models for Lao
    Lin, Nankai
    Fu, Yingwen
    Yang, Ziyu
    Chen, Chuwei
    Jiang, Shengyi
    [J]. LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022: 6506 - 6512