Exploring Lottery Prompts for Pre-trained Language Models

Cited: 0
Authors
Chen, Yulin [1 ]
Ding, Ning [1 ,2 ]
Wang, Xiaobin [3 ]
Hu, Shengding [2 ]
Zheng, Hai-Tao [1 ,4 ]
Liu, Zhiyuan [2 ,5 ,6 ]
Xie, Pengjun [3 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] Tsinghua Univ, DCST, Beijing, Peoples R China
[3] Alibaba Grp, Hangzhou, Peoples R China
[4] Pengcheng Lab, Shenzhen, Peoples R China
[5] Tsinghua Univ, IAI, BNRIST, Beijing, Peoples R China
[6] Jiangsu Normal Univ, Jiangsu Collaborat Innovat Ctr Language Abil, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on model adaptation, necessitating more efficient alternatives to conventional fine-tuning. Given the advantage of prompting in the zero-shot setting and the observed performance fluctuation among different prompts, we explore instance-level prompts and their generalizability. By searching through the prompt space, we first validate the assumption that for every instance there is almost always a lottery prompt that induces the correct prediction from the PLM, and that such prompts can be obtained at a low cost thanks to the inherent ability of PLMs. Meanwhile, we find that some strong lottery prompts perform well over the whole training set and exhibit distinguishable linguistic features. Lastly, we attempt to generalize the searched strong lottery prompts to unseen data with a prompt ensembling method, without any parameter tuning. Experiments are conducted on various types of NLP classification tasks and demonstrate that the proposed method achieves results comparable to other gradient-free and optimization-free baselines.
Pages: 15428-15444
Number of pages: 17
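To make the search procedure described in the abstract concrete, below is a minimal sketch of how an instance-level lottery-prompt search and a simple prompt ensemble could look with an off-the-shelf masked language model via the Hugging Face transformers library. The backbone (roberta-large), the tiny template pool, and the label words are illustrative assumptions, not the paper's actual prompt space or verbalizer.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "roberta-large"  # assumed backbone; the paper's exact PLMs may differ
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME).eval()

# Hypothetical prompt space and verbalizer for binary sentiment classification.
TEMPLATES = [
    "{x} It was {mask}.",
    "{x} The movie is {mask}.",
    "{x} All in all, a {mask} experience.",
]
LABEL_WORDS = {0: " terrible", 1: " great"}  # class id -> label word

def predict(text, template):
    """Fill the template, read the PLM's logits at the mask position, and
    return the class whose label word scores highest."""
    prompt = template.format(x=text, mask=tokenizer.mask_token)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
    scores = {
        cls: logits[0, mask_pos, tokenizer.encode(word, add_special_tokens=False)[0]].item()
        for cls, word in LABEL_WORDS.items()
    }
    return max(scores, key=scores.get)

def find_lottery_prompt(text, label):
    """Return the first template that induces the correct prediction for this
    instance, i.e. its lottery prompt, or None if this toy space has none."""
    for template in TEMPLATES:
        if predict(text, template) == label:
            return template
    return None

def ensemble_predict(text, strong_templates):
    """Majority vote over a set of strong lottery prompts; no parameter tuning."""
    votes = [predict(text, t) for t in strong_templates]
    return max(set(votes), key=votes.count)

print(find_lottery_prompt("A moving and beautifully shot film.", 1))

In the same spirit, a "strong" lottery prompt would be one whose accuracy over the whole training set is high, and ensemble_predict illustrates one gradient-free way such prompts could be combined on unseen data; the paper's actual search space, scoring, and ensembling strategy may differ.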