Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations

Cited: 0
Authors
Pham, Minh-Quang [1 ]
Indurthi, Sathish Reddy [1 ]
Chollampatt, Shamil [1 ]
Turchi, Marco [1 ]
Affiliation
[1] Zoom Video Commun, San Jose, CA 95113 USA
Keywords
(none listed)
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models (LLMs) like ChatGPT can be expensive to train, deploy, and use for specific natural language generation tasks such as text summarization and for certain domains. A promising alternative is to fine-tune relatively smaller language models (LMs) on a particular task using high-quality, in-domain datasets. However, obtaining such high-quality training data can be prohibitively expensive. This issue has been mitigated by generating weakly supervised data via knowledge distillation (KD) of LLMs. We propose a three-step approach to distill ChatGPT and fine-tune smaller LMs for summarizing forum conversations. More specifically, we design a method to selectively sample a large unannotated corpus of forum conversations using a semantic similarity metric. Then, we use the same metric to retrieve suitable prompts for ChatGPT from a small annotated validation set in the same domain. The generated dataset is then filtered to remove low-quality instances. By leveraging sufficient in-domain pseudo-labelled data, our proposed select-prompt-filter KD approach yields improvements of up to 6.6 ROUGE-2 points over a standard KD approach given the same amount of training data.
Pages: 12257 - 12265
Number of pages: 9
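The three steps described in the abstract (select similar conversations, retrieve a prompt exemplar, filter low-quality outputs) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bag-of-words cosine is a toy stand-in for the semantic similarity metric the authors use, and all function names, thresholds, and record fields (`select`, `retrieve_prompt`, `filter_pairs`, `conversation`, `summary`) are hypothetical.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a semantic encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(cnt * b.get(tok, 0) for tok, cnt in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select(unlabeled, validation, k):
    """Step 1 (select): keep the k unannotated conversations most similar
    to the in-domain validation set (scored by max similarity to any example)."""
    val_vecs = [embed(v["conversation"]) for v in validation]
    scored = [(max(cosine(embed(c), v) for v in val_vecs), c) for c in unlabeled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]

def retrieve_prompt(conversation, validation):
    """Step 2 (prompt): use the most similar annotated example as a
    one-shot prompt for the teacher LLM."""
    best = max(validation, key=lambda v: cosine(embed(conversation),
                                                embed(v["conversation"])))
    return ("Summarize the conversation.\n\n"
            f"Example conversation:\n{best['conversation']}\n"
            f"Example summary:\n{best['summary']}\n\n"
            f"Conversation:\n{conversation}\nSummary:")

def filter_pairs(pairs, min_sim=0.1):
    """Step 3 (filter): drop pseudo-labelled pairs whose generated summary
    has too little overlap with its source (a crude quality proxy)."""
    return [(c, s) for c, s in pairs if cosine(embed(c), embed(s)) >= min_sim]
```

The surviving (conversation, summary) pairs would then serve as weakly supervised training data for fine-tuning a smaller summarization LM.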