Distilling the Knowledge of Large-scale Generative Models into Retrieval Models for Efficient Open-domain Conversation

Cited by: 0
Authors:
Kim, Beomsu [1]
Seo, Seokjun [1]
Han, Seungju [1]
Erdenee, Enkhbayar [1]
Chang, Buru [1]
Affiliations:
[1] Hyperconnect, Seoul, South Korea
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Despite the remarkable performance of large-scale generative models in open-domain conversation, they are known to be less practical for building real-time conversation systems due to high latency. On the other hand, retrieval models can return responses with much lower latency but show inferior performance to large-scale generative models, since the conversation quality is bounded by the pre-defined response set. To take advantage of both approaches, we propose a new training method called G2R (Generative-to-Retrieval distillation) that preserves the efficiency of a retrieval model while leveraging the conversational ability of a large-scale generative model by infusing the knowledge of the generative model into the retrieval model. G2R consists of two distinct distillation techniques: data-level G2R augments the dialogue dataset with additional responses generated by the large-scale generative model, and model-level G2R transfers the response quality score assessed by the generative model to the score of the retrieval model via a knowledge distillation loss. Through extensive experiments, including human evaluation, we demonstrate that our retrieval-based conversation system trained with G2R shows substantially improved performance compared to the baseline retrieval model while exhibiting significantly lower inference latency than the large-scale generative models.
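The model-level G2R described in the abstract transfers the teacher's response quality scores to the student's retrieval scores with a knowledge distillation loss. The following is a minimal illustrative sketch, not the authors' implementation: it assumes the generative teacher and the retrieval student each assign a raw score to the same list of candidate responses, and computes the standard KL-divergence distillation loss between the two softened score distributions.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution over candidates."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_scores, teacher_scores, temperature=2.0):
    """KL divergence KL(teacher || student) over the same candidate
    responses -- the shape of a model-level distillation loss.

    student_scores: retrieval model's matching scores (hypothetical).
    teacher_scores: generative model's response quality scores (hypothetical).
    """
    p = softmax(teacher_scores, temperature)  # soft targets from the teacher
    q = softmax(student_scores, temperature)  # student's soft predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

During training, this loss would be minimized with respect to the student's scores, pulling the retrieval model's ranking of candidate responses toward the generative model's quality assessment; the temperature softens both distributions so that low-ranked candidates still carry a gradient signal.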
Pages: 3357-3373
Page count: 17