Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations

Cited by: 0
Authors
Pham, Minh-Quang [1 ]
Indurthi, Sathish Reddy [1 ]
Chollampatt, Shamil [1 ]
Turchi, Marco [1 ]
Affiliations
[1] Zoom Video Commun, San Jose, CA 95113 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) like ChatGPT can be expensive to train, deploy, and use for specific natural language generation tasks such as text summarization and for certain domains. A promising alternative is to fine-tune relatively smaller language models (LMs) on a particular task using high-quality, in-domain datasets. However, it can be prohibitively expensive to get such high-quality training data. This issue has been mitigated by generating weakly supervised data via knowledge distillation (KD) of LLMs. We propose a three-step approach to distill ChatGPT and fine-tune smaller LMs for summarizing forum conversations. More specifically, we design a method to selectively sample a large unannotated corpus of forum conversations using a semantic similarity metric. Then, we use the same metric to retrieve suitable prompts for ChatGPT from a small annotated validation set in the same domain. The generated dataset is then filtered to remove low-quality instances. Our proposed select-prompt-filter KD approach leads to significant improvements of up to 6.6 ROUGE-2 points over a standard KD approach by leveraging sufficient in-domain pseudo-labelled data, given the same amount of training data.
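The abstract describes the select-prompt-filter pipeline only at a high level. The sketch below is a rough illustration, not the authors' implementation: it assumes the semantic similarity metric is cosine similarity over sentence-transformer embeddings, that prompts are built by retrieving the nearest annotated example from the validation set, and that filtering uses a simple similarity/length heuristic; the encoder choice, thresholds, and function names are all hypothetical.

```python
# Hedged sketch of a select-prompt-filter distillation pipeline.
# Assumptions (not from the paper): cosine similarity over sentence
# embeddings as the similarity metric, nearest-neighbour prompt
# retrieval, and a threshold-based quality filter.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical encoder choice

def select(unlabeled_convs, val_convs, top_k=1000):
    """Step 1: keep unlabeled conversations most similar to the in-domain validation set."""
    u = encoder.encode(unlabeled_convs, convert_to_tensor=True)
    v = encoder.encode(val_convs, convert_to_tensor=True)
    # Score each unlabeled conversation by its maximum similarity to any validation example.
    scores = util.cos_sim(u, v).max(dim=1).values
    ranked = scores.argsort(descending=True)[:top_k]
    return [unlabeled_convs[i] for i in ranked]

def build_prompt(conv, val_convs, val_summaries):
    """Step 2: retrieve the closest annotated example and use it as an in-context prompt for the LLM."""
    sims = util.cos_sim(encoder.encode([conv], convert_to_tensor=True),
                        encoder.encode(val_convs, convert_to_tensor=True))[0]
    j = int(sims.argmax())
    return (f"Conversation:\n{val_convs[j]}\nSummary:\n{val_summaries[j]}\n\n"
            f"Conversation:\n{conv}\nSummary:")

def keep(conv, summary, min_sim=0.4):
    """Step 3: filter out low-quality pseudo-labels, e.g. summaries weakly related to the source."""
    sim = float(util.cos_sim(encoder.encode([conv]), encoder.encode([summary]))[0][0])
    return len(summary.split()) > 5 and sim >= min_sim
```

The filtered (conversation, summary) pairs would then serve as weakly supervised training data for fine-tuning the smaller LM.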
Pages: 12257 - 12265
Page count: 9