Eliciting knowledge from language models with automatically generated continuous prompts

Cited by: 1
Authors
Chen, Yadang [1 ,2 ]
Yang, Gang [1 ,2 ]
Wang, Duolin [1 ,2 ]
Li, Dichao [3 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Peoples R China
Keywords
Prompt learning; Initialization; Trigger token; Continuous parameters
DOI
10.1016/j.eswa.2023.122327
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Pre-trained Language Models (PLMs) have demonstrated remarkable performance on Natural Language Understanding (NLU) tasks, and continuous prompt-based fine-tuning further enhances their capabilities. However, current methods rely on hand-crafted discrete prompts to initialize continuous prompts; such prompts are sensitive to subtle changes and inherently limited by the constraints of natural language. To address these limitations, this study introduces an AutoPrompt-based Prompt Tuning (APT) approach. APT optimizes the initialization of continuous prompts by employing a gradient-guided automatic search to generate discrete templates and identify trigger tokens. Because the trigger tokens already capture semantic features of the target task dataset, the continuous parameters they initialize are highly task-relevant, providing a superior starting point for prompt tuning. APT searches for optimal prompts across various NLU tasks, enabling the PLM to learn task-related knowledge effectively. The method significantly improves PLM performance in both few-shot and fully supervised settings while eliminating the need for extensive prompt engineering. On the LAMA (Language Model Analysis) knowledge-probing benchmark, APT achieved 58.6% (P@1) without additional text, representing a 3.6% improvement over the previous best result. APT also outperformed state-of-the-art methods on the SuperGLUE benchmark.
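To make the mechanism concrete, below is a minimal PyTorch sketch of the two stages the abstract describes: an AutoPrompt-style gradient-guided (HotFlip-approximation) search for discrete trigger tokens, followed by initializing a continuous prompt from their embeddings. This is an illustration under stated assumptions, not the authors' implementation: it assumes a BERT-style masked LM, and the helper names (trigger_grad, hotflip_candidates, init_soft_prompt) and the toy cloze are invented for the example.

    import torch
    import torch.nn.functional as F
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tok = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-cased").eval()
    emb = model.get_input_embeddings().weight            # (vocab_size, hidden)

    def trigger_grad(input_ids, trigger_pos, mask_pos, label_id):
        # Gradient of the cloze loss w.r.t. the embeddings in the trigger slots.
        inputs = emb[input_ids].detach().clone().requires_grad_(True)
        logits = model(inputs_embeds=inputs.unsqueeze(0)).logits[0]
        loss = F.cross_entropy(logits[mask_pos].unsqueeze(0),
                               torch.tensor([label_id]))
        loss.backward()
        return inputs.grad[trigger_pos]                  # (n_triggers, hidden)

    def hotflip_candidates(grad, k=5):
        # First-order estimate of the loss change if each vocabulary token were
        # swapped into a trigger slot; the k lowest scores reduce the loss most.
        scores = grad @ emb.T                            # (n_triggers, vocab)
        return scores.topk(k, largest=False).indices

    def init_soft_prompt(trigger_ids):
        # Seed the continuous prompt parameters with the searched trigger
        # embeddings, so prompt tuning starts from task-relevant vectors.
        return torch.nn.Parameter(emb[trigger_ids].detach().clone())

    # Toy LAMA-style query with three trigger slots (positions 1-3 after [CLS]):
    ids = tok("[MASK] [MASK] [MASK] Paris is the [MASK] of France .",
              return_tensors="pt").input_ids[0]
    g = trigger_grad(ids, trigger_pos=[1, 2, 3],
                     mask_pos=(ids == tok.mask_token_id).nonzero()[-1].item(),
                     label_id=tok.convert_tokens_to_ids("capital"))
    print(hotflip_candidates(g, k=3))                    # candidate trigger ids

In the full AutoPrompt procedure each candidate is re-evaluated exactly before a swap is accepted; the sketch only highlights the initialization step that distinguishes APT, namely handing the searched trigger embeddings to prompt tuning instead of starting from random or hand-crafted vectors.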
Pages: 12
Related Papers
50 records in total
  • [1] AUTOPROMPT: Eliciting Knowledge from Language Models with Automatically Generated Prompts
    Shin, Taylor
    Razeghi, Yasaman
    Logan, Robert L., IV
    Wallace, Eric
    Singh, Sameer
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 4222 - 4235
  • [2] Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer
    Wei, Yinyi
    Mo, Tong
    Jiang, Yongtao
    Li, Weiping
    Zhao, Wen
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT II, 2022, 13530 : 222 - 233
  • [3] Natural Language Interaction Based on Automatically Generated Conceptual Models
    Perez-Marin, Diana
    Pascual-Nieto, Ismael
    Marin, Pilar Rodriguez
    ICEIS 2008: PROCEEDINGS OF THE TENTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL HCI: HUMAN-COMPUTER INTERACTION, 2008, : 5 - 12
  • [4] Eliciting Collective Behaviors through Automatically Generated Environments
    Fine, Benjamin T.
    Shell, Dylan A.
    2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2013, : 3303 - 3308
  • [5] GenKP: generative knowledge prompts for enhancing large language models
    Li, Xinbai
    Peng, Shaowen
    Yada, Shuntaro
    Wakamiya, Shoko
    Aramaki, Eiji
    APPLIED INTELLIGENCE, 2025, 55 (06)
  • [6] Few-Shot Relation Extraction With Automatically Generated Prompts
    Zhao, Xiaoyan
    Yang, Min
    Qu, Qiang
    Xu, Ruifeng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (03) : 4971 - 4983
  • [7] Unnatural language processing: How do language models handle machine-generated prompts?
    Kervadec, Corentin
    Franzon, Francesca
    Baroni, Marco
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 14377 - 14392
  • [8] Vulnerability Analysis of Continuous Prompts for Pre-trained Language Models
    Li, Zhicheng
    Shi, Yundi
    Sheng, Xuan
    Yin, Changchun
    Zhou, Lu
    Li, Piji
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT IX, 2023, 14262 : 508 - 519
  • [9] Evaluation of Automatically Generated Video Captions Using Vision and Language Models
    Lebron, Luis
    Graham, Yvette
    O'Connor, Noel E.
    McGuinness, Kevin
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2416 - 2420