Eliciting knowledge from language models with automatically generated continuous prompts

Cited by: 1
Authors
Chen, Yadang [1 ,2 ]
Yang, Gang [1 ,2 ]
Wang, Duolin [1 ,2 ]
Li, Dichao [3 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Peoples R China
Keywords
Prompt learning; Initialization; Trigger token; Continuous parameters;
DOI
10.1016/j.eswa.2023.122327
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Number
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained Language Models (PLMs) have demonstrated remarkable performance in Natural Language Understanding (NLU) tasks, with continuous prompt-based fine-tuning further enhancing their capabilities. However, current methods rely on hand-crafted discrete prompts to initialize continuous prompts, which are sensitive to subtle changes and inherently limited by the constraints of natural language. To address these limitations, this study introduces an innovative AutoPrompt-based Prompt Tuning (APT) approach. APT optimizes the initialization of continuous prompts by employing a gradient-guided automatic search to generate ideal discrete templates and identify trigger tokens. As the semantic features are already captured from the target task dataset, the continuous parameters initialized by trigger tokens are highly relevant, providing a superior starting point for prompt tuning. APT searches for optimal prompts across various NLU tasks, enabling the PLM to learn task-related knowledge effectively. The APT method significantly improves PLM performance in both few-shot and fully supervised settings, eliminating the need for extensive prompt engineering. On the knowledge-exploration Language Model Analysis (LAMA) benchmark, APT achieved a remarkable 58.6% (P@1) performance without additional text, a 3.6% improvement over the previous best result. Additionally, APT outperformed state-of-the-art methods on the SuperGLUE benchmark.
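The gradient-guided trigger-token search described in the abstract can be sketched in a few lines. The following is a minimal illustrative example, not the paper's implementation: it uses a HotFlip-style first-order approximation (as in the original AutoPrompt work) to pick discrete trigger tokens, whose embeddings then serve as the starting point for the continuous prompt parameters. All array sizes and the random "gradient" are stand-ins for a real PLM and task loss.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, n_triggers = 50, 8, 3
embedding = rng.normal(size=(vocab_size, dim))  # token embedding table

def gradient_guided_search(grad, embedding):
    # HotFlip-style first-order approximation: swapping slot i's token for v
    # changes the loss by roughly e_v . grad_i (plus a per-slot constant),
    # so pick the vocab token minimising that dot product for each slot.
    scores = embedding @ grad.T           # shape (vocab, n_triggers)
    return np.argmin(scores, axis=0)      # one trigger token id per slot

# Stand-in for the gradient of a task loss w.r.t. the trigger embeddings
# (in APT this would come from backprop through the frozen PLM).
grad = rng.normal(size=(n_triggers, dim))
trigger_ids = gradient_guided_search(grad, embedding)

# Continuous prompt parameters initialised from the found trigger tokens;
# these are subsequently fine-tuned directly in embedding space.
continuous_prompt = embedding[trigger_ids].copy()
print(trigger_ids.shape, continuous_prompt.shape)
```

Because the trigger tokens were chosen against the task loss, the resulting embeddings already encode task-relevant semantics, which is the claimed advantage over initializing from a hand-crafted discrete prompt.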
Pages: 12
Related Papers
50 items
  • [41] GenKP: generative knowledge prompts for enhancing large language models
    Xinbai Li
    Shaowen Peng
    Shuntaro Yada
    Shoko Wakamiya
    Eiji Aramaki
    Applied Intelligence, 2025, 55 (7)
  • [42] Extensible Prompts for Language Models on Zero-shot Language Style Customization
    Ge, Tao
    Hu, Jing
    Dong, Li
    Mao, Shaoguang
    Xia, Yan
    Wang, Xun
    Chen, Si-Qing
    Wei, Furu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [43] Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
    Cheng, Myra
    Durmus, Esin
    Jurafsky, Dan
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 1504 - 1532
  • [44] Narrative knowledge: eliciting organisational knowledge from storytelling
    Hannabuss, S
    ASLIB PROCEEDINGS, 2000, 52 (10): : 402 - 413
  • [45] Automatically Extracting Procedural Knowledge from Instructional Texts using Natural Language Processing
    Zhang, Ziqi
    Webster, Philip
    Uren, Victoria
    Varga, Andrea
    Ciravegna, Fabio
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 520 - 527
  • [46] Handling Uncertainty in Automatically Generated Implementation Models in the Automotive Domain
    Bucaioni, Alessio
    Cicchetti, Antonio
    Ciccozzi, Federico
    Mubeen, Saad
    Pierantonio, Alfonso
    Sjodin, Mikael
    2016 42ND EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA), 2016, : 173 - 180
  • [47] Comprehensive comparison of automatically generated QSAR models of target potency
    Champness, Edmund
    Segall, Matthew
    Gola, Joelle
    Lariviere, Delphine
    Leeding, Chris
    Yusof, Iskander
    Chisholm, James
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2012, 243
  • [48] Showing automatically generated students' conceptual models to students and teachers
    Pérez-Marín D.
    Pascual-Nieto I.
    International Journal of Artificial Intelligence in Education, 2010, 20 (01) : 47 - 72
  • [49] Eliciting Affective Events from Language Models by Multiple View Co-prompting
    Zhuang, Yuan
    Riloff, Ellen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3189 - 3201
  • [50] Exploring Lottery Prompts for Pre-trained Language Models
    Chen, Yulin
    Ding, Ning
    Wang, Xiaobin
    Hu, Shengding
    Zheng, Hai-Tao
    Liu, Zhiyuan
    Xie, Pengjun
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15428 - 15444