Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations

Cited: 0
Authors
Pham, Minh-Quang [1 ]
Indurthi, Sathish Reddy [1 ]
Chollampatt, Shamil [1 ]
Turchi, Marco [1 ]
Affiliation
[1] Zoom Video Commun, San Jose, CA 95113 USA
Keywords
(none listed)
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models (LLMs) like ChatGPT can be expensive to train, deploy, and use for specific natural language generation tasks such as text summarization and for certain domains. A promising alternative is to fine-tune relatively smaller language models (LMs) on a particular task using high-quality, in-domain datasets. However, obtaining such high-quality training data can be prohibitively expensive. This issue has been mitigated by generating weakly supervised data via knowledge distillation (KD) of LLMs. We propose a three-step approach to distill ChatGPT and fine-tune smaller LMs for summarizing forum conversations. More specifically, we design a method to selectively sample a large unannotated corpus of forum conversations using a semantic similarity metric. Then, we use the same metric to retrieve suitable prompts for ChatGPT from a small annotated validation set in the same domain. The generated dataset is then filtered to remove low-quality instances. By leveraging sufficient in-domain pseudo-labelled data, our proposed select-prompt-filter KD approach yields improvements of up to 6.6 ROUGE-2 points over a standard KD approach given the same amount of training data.
Pages: 12257 - 12265
Number of pages: 9
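The three steps described in the abstract (select similar conversations, retrieve a prompt exemplar, filter low-quality outputs) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bag-of-words cosine is a toy stand-in for the semantic similarity metric the authors use, and all function names, thresholds, and record fields (`select`, `retrieve_prompt`, `filter_pairs`, `conversation`, `summary`) are hypothetical.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a semantic encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(cnt * b.get(tok, 0) for tok, cnt in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select(unlabeled, validation, k):
    """Step 1 (select): keep the k unannotated conversations most similar
    to the in-domain validation set (scored by max similarity to any example)."""
    val_vecs = [embed(v["conversation"]) for v in validation]
    scored = [(max(cosine(embed(c), v) for v in val_vecs), c) for c in unlabeled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]

def retrieve_prompt(conversation, validation):
    """Step 2 (prompt): use the most similar annotated example as a
    one-shot prompt for the teacher LLM."""
    best = max(validation, key=lambda v: cosine(embed(conversation),
                                                embed(v["conversation"])))
    return ("Summarize the conversation.\n\n"
            f"Example conversation:\n{best['conversation']}\n"
            f"Example summary:\n{best['summary']}\n\n"
            f"Conversation:\n{conversation}\nSummary:")

def filter_pairs(pairs, min_sim=0.1):
    """Step 3 (filter): drop pseudo-labelled pairs whose generated summary
    has too little overlap with its source (a crude quality proxy)."""
    return [(c, s) for c, s in pairs if cosine(embed(c), embed(s)) >= min_sim]
```

The surviving (conversation, summary) pairs would then serve as weakly supervised training data for fine-tuning a smaller summarization LM.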