Eliciting knowledge from language models with automatically generated continuous prompts

Cited by: 1
Authors
Chen, Yadang [1 ,2 ]
Yang, Gang [1 ,2 ]
Wang, Duolin [1 ,2 ]
Li, Dichao [3 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Peoples R China
Keywords
Prompt learning; Initialization; Trigger token; Continuous parameters;
DOI
10.1016/j.eswa.2023.122327
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Number
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained Language Models (PLMs) have demonstrated remarkable performance in Natural Language Understanding (NLU) tasks, with continuous prompt-based fine-tuning further enhancing their capabilities. However, current methods rely on hand-crafted discrete prompts to initialize continuous prompts, which are sensitive to subtle changes and inherently limited by the constraints of natural language. To address these limitations, this study introduces an innovative AutoPrompt-based Prompt Tuning (APT) approach. APT optimizes the initialization of continuous prompts by employing a gradient-guided automatic search to generate ideal discrete templates and identify trigger tokens. As the semantic features are already captured from the target task dataset, the continuous parameters initialized by trigger tokens are highly relevant, providing a superior starting point for prompt tuning. APT searches for optimal prompts across various NLU tasks, enabling the PLM to learn task-related knowledge effectively. The APT method significantly improves PLM performance in both few-shot and fully supervised settings, eliminating the need for extensive prompt engineering. On the knowledge-exploration Language Model Analysis (LAMA) benchmark, APT achieved a remarkable 58.6% (P@1) performance without additional text, a 3.6% improvement over the previous best result. Additionally, APT outperformed state-of-the-art methods on the SuperGLUE benchmark.
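The gradient-guided trigger-token search described in the abstract can be sketched in a few lines. The following is a minimal illustrative example, not the paper's implementation: it uses a HotFlip-style first-order approximation (as in the original AutoPrompt work) to pick discrete trigger tokens, whose embeddings then serve as the starting point for the continuous prompt parameters. All array sizes and the random "gradient" are stand-ins for a real PLM and task loss.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, n_triggers = 50, 8, 3
embedding = rng.normal(size=(vocab_size, dim))  # token embedding table

def gradient_guided_search(grad, embedding):
    # HotFlip-style first-order approximation: swapping slot i's token for v
    # changes the loss by roughly e_v . grad_i (plus a per-slot constant),
    # so pick the vocab token minimising that dot product for each slot.
    scores = embedding @ grad.T           # shape (vocab, n_triggers)
    return np.argmin(scores, axis=0)      # one trigger token id per slot

# Stand-in for the gradient of a task loss w.r.t. the trigger embeddings
# (in APT this would come from backprop through the frozen PLM).
grad = rng.normal(size=(n_triggers, dim))
trigger_ids = gradient_guided_search(grad, embedding)

# Continuous prompt parameters initialised from the found trigger tokens;
# these are subsequently fine-tuned directly in embedding space.
continuous_prompt = embedding[trigger_ids].copy()
print(trigger_ids.shape, continuous_prompt.shape)
```

Because the trigger tokens were chosen against the task loss, the resulting embeddings already encode task-relevant semantics, which is the claimed advantage over initializing from a hand-crafted discrete prompt.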
Pages: 12
Related Papers
50 items
  • [41] GenKP: generative knowledge prompts for enhancing large language models
    Xinbai Li
    Shaowen Peng
    Shuntaro Yada
    Shoko Wakamiya
    Eiji Aramaki
    Applied Intelligence, 2025, 55 (7)
  • [42] Extensible Prompts for Language Models on Zero-shot Language Style Customization
    Ge, Tao
    Hu, Jing
    Dong, Li
    Mao, Shaoguang
    Xia, Yan
    Wang, Xun
    Chen, Si-Qing
    Wei, Furu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [43] Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
    Cheng, Myra
    Durmus, Esin
    Jurafsky, Dan
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 1504 - 1532
  • [44] Narrative knowledge: eliciting organisational knowledge from storytelling
    Hannabuss, S
    ASLIB PROCEEDINGS, 2000, 52 (10): : 402 - 413
  • [45] Automatically Extracting Procedural Knowledge from Instructional Texts using Natural Language Processing
    Zhang, Ziqi
    Webster, Philip
    Uren, Victoria
    Varga, Andrea
    Ciravegna, Fabio
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 520 - 527
  • [46] Handling Uncertainty in Automatically Generated Implementation Models in the Automotive Domain
    Bucaioni, Alessio
    Cicchetti, Antonio
    Ciccozzi, Federico
    Mubeen, Saad
    Pierantonio, Alfonso
    Sjodin, Mikael
    2016 42ND EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA), 2016, : 173 - 180
  • [47] Comprehensive comparison of automatically generated QSAR models of target potency
    Champness, Edmund
    Segall, Matthew
    Gola, Joelle
    Lariviere, Delphine
    Leeding, Chris
    Yusof, Iskander
    Chisholm, James
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2012, 243
  • [48] Showing automatically generated students' conceptual models to students and teachers
    Pérez-Marín D.
    Pascual-Nieto I.
    International Journal of Artificial Intelligence in Education, 2010, 20 (01) : 47 - 72
  • [49] Eliciting Affective Events from Language Models by Multiple View Co-prompting
    Zhuang, Yuan
    Riloff, Ellen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3189 - 3201
  • [50] Exploring Lottery Prompts for Pre-trained Language Models
    Chen, Yulin
    Ding, Ning
    Wang, Xiaobin
    Hu, Shengding
    Zheng, Hai-Tao
    Liu, Zhiyuan
    Xie, Pengjun
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15428 - 15444