Distilling the Knowledge of Large-scale Generative Models into Retrieval Models for Efficient Open-domain Conversation

Authors
Kim, Beomsu [1]
Seo, Seokjun [1]
Han, Seungju [1]
Erdenee, Enkhbayar [1]
Chang, Buru [1]
Affiliation
[1] Hyperconnect, Seoul, South Korea
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the remarkable performance of large-scale generative models in open-domain conversation, they are known to be less practical for building real-time conversation systems due to high latency. On the other hand, retrieval models can return responses with much lower latency but show inferior performance to the large-scale generative models, since the conversation quality is bounded by the pre-defined response set. To take advantage of both approaches, we propose a new training method called G2R (Generative-to-Retrieval distillation) that preserves the efficiency of a retrieval model while leveraging the conversational ability of a large-scale generative model by infusing the knowledge of the generative model into the retrieval model. G2R consists of two distinct distillation techniques: the data-level G2R augments the dialogue dataset with additional responses generated by the large-scale generative model, and the model-level G2R transfers the response quality score assessed by the generative model to the score of the retrieval model through a knowledge distillation loss. Through extensive experiments including human evaluation, we demonstrate that our retrieval-based conversation system trained with G2R shows substantially improved performance compared to the baseline retrieval model while showing significantly lower inference latency than the large-scale generative models.
Pages: 3357-3373
Number of pages: 17
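
The two G2R techniques named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact form of the loss, so the code assumes a softmax cross-entropy between teacher and student scores over a shared candidate set. The names `augment_dialogues`, `distillation_loss`, the `generate` callback, and the `temperature` parameter are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence


def softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over candidate scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax (avoids taking log of an underflowed 0)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    log_total = math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - peak - log_total for s in scaled]


def augment_dialogues(
    dataset: List[Dict],
    generate: Callable[[str, int], List[str]],
    num_extra: int,
) -> List[Dict]:
    """Data-level step (assumed shape): add generated responses per context.

    `generate` stands in for the large generative model; it is a placeholder
    supplied by the caller. The input dataset is left unmodified.
    """
    augmented = []
    for example in dataset:
        extra = generate(example["context"], num_extra)
        augmented.append(
            {
                "context": example["context"],
                "responses": list(example["responses"]) + list(extra),
            }
        )
    return augmented


def distillation_loss(
    teacher_scores: Sequence[float],
    student_scores: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """Model-level step (assumed form): cross-entropy between the teacher's
    and the student's softmax distributions over the same candidates."""
    if len(teacher_scores) != len(student_scores):
        raise ValueError("teacher and student must score the same candidates")
    teacher_probs = softmax(teacher_scores, temperature)
    student_log_probs = log_softmax(student_scores, temperature)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))
```

Under this assumed loss, a retrieval model whose scores rank the candidates the same way the generative model does incurs a lower loss than one that ranks them in reverse, which is the sense in which the teacher's quality scores are transferred to the student.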