Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations

Cited by: 0
Authors
Pham, Minh-Quang [1]
Indurthi, Sathish Reddy [1]
Chollampatt, Shamil [1]
Turchi, Marco [1]
Affiliations
[1] Zoom Video Communications, San Jose, CA 95113, USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) such as ChatGPT can be expensive to train, deploy, and use for specific natural language generation tasks, such as text summarization, and for particular domains. A promising alternative is to fine-tune relatively small language models (LMs) on a particular task using high-quality, in-domain datasets. However, obtaining such high-quality training data can be prohibitively expensive. This issue has been mitigated by generating weakly supervised data via knowledge distillation (KD) of LLMs. We propose a three-step approach to distill ChatGPT and fine-tune smaller LMs for summarizing forum conversations. More specifically, we design a method to selectively sample a large unannotated corpus of forum conversations using a semantic similarity metric. Then, we use the same metric to retrieve suitable prompts for ChatGPT from a small annotated validation set in the same domain. The generated dataset is then filtered to remove low-quality instances. Given the same amount of training data, our proposed select-prompt-filter KD approach yields significant improvements of up to 6.6 ROUGE-2 points over a standard KD approach by leveraging sufficient in-domain pseudo-labelled data.
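The three steps named in the abstract (select, prompt, filter) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every specific in it is an assumption. A bag-of-words cosine stands in for the unspecified semantic similarity metric. The `teacher` argument is a hypothetical callable that wraps whatever LLM is being distilled. The filtering rule (drop empty summaries and summaries that are not shorter than the source) is a placeholder, since the abstract does not state the actual criterion. The validation set is assumed to be non-empty.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (conversation, reference summary)


def embed(text: str) -> Counter:
    """Toy stand-in for a semantic encoder: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select(corpus: List[str], validation: List[Example], k: int) -> List[str]:
    """Step 1: keep the k unannotated conversations closest to the validation set."""
    val_vecs = [embed(conv) for conv, _ in validation]
    scored = [(max(cosine(embed(c), v) for v in val_vecs), c) for c in corpus]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [c for _, c in scored[:k]]


def build_prompt(conversation: str, validation: List[Example], n_shots: int) -> str:
    """Step 2: retrieve the most similar annotated examples as demonstrations."""
    query = embed(conversation)
    ranked = sorted(
        validation, key=lambda ex: cosine(query, embed(ex[0])), reverse=True
    )
    shots = "\n\n".join(
        f"Conversation: {c}\nSummary: {s}" for c, s in ranked[:n_shots]
    )
    return f"{shots}\n\nConversation: {conversation}\nSummary:"


def keep(conversation: str, summary: str) -> bool:
    """Step 3 (assumed rule): drop empty or non-compressive summaries."""
    n_words = len(summary.split())
    return 0 < n_words < len(conversation.split())


def distill(
    corpus: List[str],
    validation: List[Example],
    teacher: Callable[[str], str],  # hypothetical wrapper around the LLM
    k: int = 2,
    n_shots: int = 1,
) -> List[Example]:
    """Run select, prompt, filter and return pseudo-labelled training pairs."""
    pairs = []
    for conv in select(corpus, validation, k):
        summary = teacher(build_prompt(conv, validation, n_shots)).strip()
        if keep(conv, summary):
            pairs.append((conv, summary))
    return pairs
```

The resulting `(conversation, summary)` pairs would then serve as weakly supervised data for fine-tuning a smaller LM. In practice, a sentence-embedding model would likely replace `embed`, and a quality-based criterion would replace `keep`.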
Pages: 12257-12265
Number of pages: 9