Prompt Optimization in Large Language Models

Cited by: 0
Authors
Sabbatella, Antonio [1 ]
Ponti, Andrea [2 ]
Giordani, Ilaria [3 ]
Candelieri, Antonio [2 ]
Archetti, Francesco [1 ]
Affiliations
[1] Univ Milano Bicocca, Dept Comp Sci Syst & Commun, I-20126 Milan, Italy
[2] Univ Milano Bicocca, Dept Econ Management & Stat, I-20126 Milan, Italy
[3] Oaks srl, I-20125 Milan, Italy
Keywords
Bayesian Optimization; prompt optimization; black-box Large Language Models;
DOI
10.3390/math12060929
CLC number
O1 [Mathematics];
Discipline code
0701; 070101;
Abstract
Prompt optimization is a crucial task for improving the performance of large language models on downstream tasks. In this paper, a prompt is a sequence of n-grams selected from a vocabulary, and the aim is to select the prompt that is optimal with respect to a given performance metric. Prompt optimization can therefore be cast as a combinatorial optimization problem, with the number of possible prompts (i.e., the size of the combinatorial search space) given by the size of the vocabulary (i.e., the number of possible n-grams) raised to the power of the prompt length; a vocabulary of 10,000 n-grams and a prompt of length 5, for instance, already yield 10,000^5 = 10^20 candidates. Exhaustive search is impractical, so an efficient search strategy is needed. We propose a Bayesian Optimization method performed over a continuous relaxation of the combinatorial search space. Bayesian Optimization is the dominant approach in black-box optimization thanks to its sample efficiency, modular structure, and versatility. We use BoTorch, a library for Bayesian Optimization research built on top of PyTorch. Specifically, we focus on Hard Prompt Tuning, which directly searches for an optimal prompt to be added to the text input without requiring access to the internals of the Large Language Model, using it as a black box (as is the case for GPT-4, which is available as a Model-as-a-Service). Although preliminary and based on "vanilla" Bayesian Optimization algorithms, our experiments with RoBERTa as the large language model, on six benchmark datasets, show good performance compared against other state-of-the-art black-box prompt optimization methods, and they enable an analysis of the trade-off between the size of the search space, accuracy, and wall-clock time.
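The following is a minimal sketch, in Python, of the kind of loop the abstract describes: Bayesian Optimization with BoTorch over a continuous relaxation of the discrete prompt space. The toy vocabulary, the rounding-based decoding, and the objective score_prompt (a stand-in for querying the black-box LLM and measuring a downstream metric) are illustrative assumptions, not the paper's exact construction; the BoTorch calls assume a recent version of the library.

```python
# Minimal sketch: Bayesian Optimization over a continuous relaxation of the
# discrete prompt space, using BoTorch. Illustrative assumptions throughout;
# not the paper's exact construction.
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

VOCAB = ["great", "terrible", "movie", "overall", "sentiment", "review"]  # toy vocabulary
L = 3  # prompt length (number of n-grams)
V = len(VOCAB)

def decode(x: torch.Tensor) -> list[str]:
    """Map a point in [0, 1]^L to a hard prompt: each coordinate picks one n-gram."""
    idx = (x * V).long().clamp(0, V - 1)
    return [VOCAB[i] for i in idx.tolist()]

def score_prompt(tokens: list[str]) -> float:
    """Hypothetical black-box objective: in practice, prepend the prompt to each
    input, query the LLM, and return a validation metric such as accuracy."""
    return sum(ord(t[0]) for t in tokens) / 1000.0  # toy deterministic stand-in

bounds = torch.stack([torch.zeros(L), torch.ones(L)]).double()

# Initial random design in the relaxed (continuous) space.
train_X = torch.rand(8, L, dtype=torch.double)
train_Y = torch.tensor([[score_prompt(decode(x))] for x in train_X], dtype=torch.double)

for _ in range(20):  # "vanilla" BO loop
    gp = SingleTaskGP(train_X, train_Y)                  # GP surrogate on the relaxation
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    acq = ExpectedImprovement(gp, best_f=train_Y.max())  # acquisition function
    cand, _ = optimize_acqf(acq, bounds=bounds, q=1, num_restarts=5, raw_samples=64)
    y = torch.tensor([[score_prompt(decode(cand[0]))]], dtype=torch.double)
    train_X = torch.cat([train_X, cand])
    train_Y = torch.cat([train_Y, y])

print("Best prompt found:", " ".join(decode(train_X[train_Y.argmax()])))
```

The surrogate model and the acquisition function operate entirely in the continuous cube [0, 1]^L, while every evaluation first decodes the candidate into a hard prompt, so the LLM is only ever used as a black box, as in the Hard Prompt Tuning setting.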
Pages: 14
Related papers
50 records in total
  • [1] PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs)
    Nazzal, Mahmoud
    Khalil, Issa
    Khreishah, Abdallah
    Phan, NhatHai
    CCS 2024 - Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security: 2266-2279
  • [2] To prompt or not to prompt: Navigating the use of Large Language Models for integrating and modeling heterogeneous data
    Remadi, Adel
    El Hage, Karim
    Hobeika, Yasmina
    Bugiotti, Francesca
    DATA & KNOWLEDGE ENGINEERING, 2024, 152
  • [3] Response Generated by Large Language Models Depends on the Structure of the Prompt
    Sarangi, Pradosh Kumar
    Mondal, Himel
    INDIAN JOURNAL OF RADIOLOGY AND IMAGING, 2024, 34 (03): 574-575
  • [4] PromptMaker: Prompt-based Prototyping with Large Language Models
    Jiang, Ellen
    Olson, Kristen
    Toh, Edwin
    Molina, Alejandra
    Donsbach, Aaron
    Terry, Michael
    Cai, Carrie J.
    EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022, 2022,
  • [5] Prompt Engineering: Guiding the Way to Effective Large Language Models
    Aljanabi, M.
    Yaseen, M. G.
    Ali, A. H.
    Mohammed, M. A.
    Iraqi Journal for Computer Science and Mathematics, 2023, 4 (04): 151-155
  • [6] Prompt text classifications with transformer models! An exemplary introduction to prompt-based learning with large language models
    Mayer, Christian W. F.
    Ludwig, Sabrina
    Brandt, Steffen
    JOURNAL OF RESEARCH ON TECHNOLOGY IN EDUCATION, 2023, 55 (01): 125-141
  • [7] The Effect of Prompt Types on Text Summarization Performance With Large Language Models
    Borhan, Iffat
    Bajaj, Akhilesh
    Journal of Database Management, 2024, 35 (01)
  • [8] Soft prompt tuning for augmenting dense retrieval with large language models
    Peng, Zhiyuan
    Wu, Xuyang
    Wang, Qifan
    Fang, Yi
    Knowledge-Based Systems, 2025, 309
  • [9] Prompt Wrangling: On Replication and Generalization in Large Language Models for PCG Levels
    Karkaj, Arash Moradi
    Nelson, Mark J.
    Koutis, Ioannis
    Hoover, Amy K.
    PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON THE FOUNDATIONS OF DIGITAL GAMES, FDG 2024, 2024,
  • [10] TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models
    Xue, Jiaqi
    Zheng, Mengxin
    Hua, Ting
    Shen, Yilin
    Liu, Yepeng
    Boloni, Ladislau
    Lou, Qian
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,