Exploring Lottery Prompts for Pre-trained Language Models

Cited: 0
Authors
Chen, Yulin [1 ]
Ding, Ning [1 ,2 ]
Wang, Xiaobin [3 ]
Hu, Shengding [2 ]
Zheng, Hai-Tao [1 ,4 ]
Liu, Zhiyuan [2 ,5 ,6 ]
Xie, Pengjun [3 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] Tsinghua Univ, DCST, Beijing, Peoples R China
[3] Alibaba Grp, Hangzhou, Peoples R China
[4] Pengcheng Lab, Shenzhen, Peoples R China
[5] Tsinghua Univ, IAI, BNRIST, Beijing, Peoples R China
[6] Jiangsu Normal Univ, Jiangsu Collaborat Innovat Ctr Language Abil, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on model adaptation, necessitating more efficient alternatives to conventional fine-tuning. Given the advantage of prompting in the zero-shot setting and the observed performance fluctuation among different prompts, we explore instance-level prompts and their generalizability. By searching through the prompt space, we first validate the assumption that for every instance there is almost always a lottery prompt that induces the correct prediction from the PLM, and that such prompts can be obtained at a low cost thanks to the inherent ability of PLMs. Meanwhile, we find that some strong lottery prompts perform well over the whole training set and exhibit distinguishable linguistic features. Lastly, we attempt to generalize the searched strong lottery prompts to unseen data with a prompt ensembling method, without any parameter tuning. Experiments are conducted on various types of NLP classification tasks and demonstrate that the proposed method achieves results comparable to other gradient-free and optimization-free baselines.
Pages: 15428-15444
Number of pages: 17
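To make the search procedure described in the abstract concrete, below is a minimal sketch of how an instance-level lottery-prompt search and a simple prompt ensemble could look with an off-the-shelf masked language model via the Hugging Face transformers library. The backbone (roberta-large), the tiny template pool, and the label words are illustrative assumptions, not the paper's actual prompt space or verbalizer.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "roberta-large"  # assumed backbone; the paper's exact PLMs may differ
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME).eval()

# Hypothetical prompt space and verbalizer for binary sentiment classification.
TEMPLATES = [
    "{x} It was {mask}.",
    "{x} The movie is {mask}.",
    "{x} All in all, a {mask} experience.",
]
LABEL_WORDS = {0: " terrible", 1: " great"}  # class id -> label word

def predict(text, template):
    """Fill the template, read the PLM's logits at the mask position, and
    return the class whose label word scores highest."""
    prompt = template.format(x=text, mask=tokenizer.mask_token)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
    scores = {
        cls: logits[0, mask_pos, tokenizer.encode(word, add_special_tokens=False)[0]].item()
        for cls, word in LABEL_WORDS.items()
    }
    return max(scores, key=scores.get)

def find_lottery_prompt(text, label):
    """Return the first template that induces the correct prediction for this
    instance, i.e. its lottery prompt, or None if this toy space has none."""
    for template in TEMPLATES:
        if predict(text, template) == label:
            return template
    return None

def ensemble_predict(text, strong_templates):
    """Majority vote over a set of strong lottery prompts; no parameter tuning."""
    votes = [predict(text, t) for t in strong_templates]
    return max(set(votes), key=votes.count)

print(find_lottery_prompt("A moving and beautifully shot film.", 1))

In the same spirit, a "strong" lottery prompt would be one whose accuracy over the whole training set is high, and ensemble_predict illustrates one gradient-free way such prompts could be combined on unseen data; the paper's actual search space, scoring, and ensembling strategy may differ.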