Guiding Large Language Models via Directional Stimulus Prompting

Cited: 0
Authors
Li, Zekun [1 ,3 ]
Peng, Baolin [2 ]
He, Pengcheng [2 ]
Galley, Michel [2 ]
Gao, Jianfeng [2 ]
Yan, Xifeng [1 ]
Affiliations
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
[2] Microsoft, Redmond, WA USA
[3] Microsoft Res, Redmond, WA USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) towards specific desired outputs. Instead of directly adjusting LLMs, our method employs a small tunable policy model (e.g., T5) to generate an auxiliary directional stimulus prompt for each input instance. These directional stimulus prompts act as nuanced, instance-specific hints and clues to guide LLMs in generating desired outcomes, such as including specific keywords in the generated summary. Our approach sidesteps the challenges of direct LLM tuning by optimizing the policy model to explore directional stimulus prompts that align LLMs with desired behaviors. The policy model can be optimized through (1) supervised fine-tuning using labeled data and (2) reinforcement learning from offline or online rewards based on the LLM's output. We evaluate our method across various tasks, including summarization, dialogue response generation, and chain-of-thought reasoning. Our experiments indicate a consistent improvement in the performance of LLMs such as ChatGPT, Codex, and InstructGPT on these supervised tasks with minimal labeled data. Remarkably, by utilizing merely 80 dialogues from the MultiWOZ dataset, our approach boosts ChatGPT's performance by a relative 41.4%, achieving or exceeding the performance of some fully supervised state-of-the-art models. Moreover, the instance-specific chain-of-thought prompt generated through our method enhances InstructGPT's reasoning accuracy, outperforming both generalized human-crafted prompts and those generated through automatic prompt engineering. The code and data are publicly available.
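The pipeline the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `policy_model` is a hypothetical stand-in for the tunable T5 policy model (here it just picks the longest words as keyword hints), `build_prompt` shows how a directional stimulus might be attached to the task prompt for a black-box LLM, and `reward` shows the kind of keyword-coverage signal that could drive the reinforcement-learning stage.

```python
def policy_model(article: str) -> list[str]:
    # Hypothetical stand-in for the small tunable policy model (e.g., T5).
    # A real policy model would be trained; this stub simply returns the
    # three longest distinct words as the "directional stimulus" keywords.
    words = {w.strip(".,").lower() for w in article.split()}
    return sorted(words, key=len, reverse=True)[:3]

def build_prompt(article: str, hints: list[str]) -> str:
    # The directional stimulus is injected into the black-box LLM's prompt
    # as an instance-specific hint, without touching the LLM's parameters.
    return (
        f"Article: {article}\n"
        f"Hint (keywords): {'; '.join(hints)}\n"
        "Summarize the article, covering the hint keywords."
    )

def reward(summary: str, reference_keywords: list[str]) -> float:
    # Example reward for the RL stage: fraction of reference keywords
    # that the LLM's generated summary actually contains.
    hits = sum(k.lower() in summary.lower() for k in reference_keywords)
    return hits / max(len(reference_keywords), 1)

article = "Researchers propose directional stimulus prompting for language models."
prompt = build_prompt(article, policy_model(article))
```

In the full method, the reward computed on the LLM's output would be fed back to update the policy model (e.g., via PPO), so that it learns to emit hints that steer the frozen LLM toward the desired behavior.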
Pages: 27
Related papers
50 records in total
  • [1] Considerations for Prompting Large Language Models
    Schulte, Brian
    [J]. JAMA ONCOLOGY, 2024, 10 (04) : 475 - 483
  • [2] Prompting Is Programming: A Query Language for Large Language Models
    Beurer-Kellner, Luca
    Fischer, Marc
    Vechev, Martin
    [J]. PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2023, 7 (PLDI):
  • [3] Graph Neural Prompting with Large Language Models
    Tian, Yijun
    Song, Huan
    Wang, Zichen
    Wang, Haozhu
    Hu, Ziqing
    Wang, Fang
    Chawla, Nitesh V.
    Xu, Panpan
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19080 - 19088
  • [4] Prompting Large Language Models With the Socratic Method
    Chang, Edward Y.
    [J]. 2023 IEEE 13TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE, CCWC, 2023, : 351 - 360
  • [5] Prompting Large Language Models to Power Educational Chatbots
    Farah, Juan Carlos
    Ingram, Sandy
    Spaenlehauer, Basile
    Lasne, Fanny Kim-Lan
    Gillet, Denis
    [J]. ADVANCES IN WEB-BASED LEARNING, ICWL 2023, 2023, 14409 : 169 - 188
  • [6] Editing Graph Visualizations by Prompting Large Language Models
    Argyriou, Evmorfia
    Boehm, Jens
    Eberle, Anne
    Gonser, Julius
    Lumpp, Anna-Lena
    Niedermann, Benjamin
    Schwarzkopf, Fabian
    [J]. GRAPH DRAWING AND NETWORK VISUALIZATION, GD 2023, PT II, 2023, 14466 : 253 - 254
  • [7] Considerations for Prompting Large Language Models-Reply
    Chen, Shan
    Savova, Guergana K.
    Bitterman, Danielle S.
    [J]. JAMA ONCOLOGY, 2024, 10 (04) : 526 - 530
  • [8] Automatic Lesson Plan Generation via Large Language Models with Self-critique Prompting
    Zheng, Ying
    Li, Xueyi
    Huang, Yaying
    Liang, Qianru
    Guo, Teng
    Hou, Mingliang
    Gao, Boyu
    Tian, Mi
    Liu, Zitao
    Luo, Weiqi
    [J]. ARTIFICIAL INTELLIGENCE IN EDUCATION: POSTERS AND LATE BREAKING RESULTS, WORKSHOPS AND TUTORIALS, INDUSTRY AND INNOVATION TRACKS, PRACTITIONERS, DOCTORAL CONSORTIUM AND BLUE SKY, AIED 2024, PT I, 2024, 2150 : 163 - 178
  • [9] The Art of Asking: Prompting Large Language Models for Serendipity Recommendations
    Fu, Zhe
    Niu, Xi
    [J]. PROCEEDINGS OF THE 2024 ACM SIGIR INTERNATIONAL CONFERENCE ON THE THEORY OF INFORMATION RETRIEVAL, ICTIR 2024, 2024, : 157 - 166
  • [10] Grammar Prompting for Domain-Specific Language Generation with Large Language Models
    Wang, Bailin
    Wang, Zi
    Wang, Xuezhi
    Cao, Yuan
    Saurous, Rif A.
    Kim, Yoon
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,