GLM-Dialog: Noise-tolerant Pre-training for Knowledge-grounded Dialogue Generation

Cited by: 0
Authors
Zhang, Jing [1]
Zhang, Xiaokang [1]
Zhang-Li, Daniel [2]
Yu, Jifan [2]
Yao, Zijun [2]
Ma, Zeyao [3]
Xu, Yiqi [1]
Wang, Haohua [2]
Zhang, Xiaohan [4]
Lin, Nianyi [2]
Lu, Sunrui [2]
Li, Juanzi [2]
Tang, Jie [2]
Affiliations
[1] Renmin University of China, Beijing, People's Republic of China
[2] Tsinghua University, Beijing, People's Republic of China
[3] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China
[4] Zhipu AI, Beijing, People's Republic of China
Keywords
Dialogue System; Dialogue Evaluation; Large Language Model
DOI
10.1145/3580305.3599832
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We present GLM-Dialog, a large-scale language model (LLM) with 10B parameters capable of knowledge-grounded conversation in Chinese, using a search engine to access knowledge on the Internet. GLM-Dialog offers a series of applicable techniques for exploiting various kinds of external knowledge, both helpful and noisy, enabling the creation of robust knowledge-grounded dialogue LLMs when suitable datasets are limited. To evaluate GLM-Dialog more fairly, we also propose a novel evaluation method that lets humans converse with multiple deployed bots simultaneously and compare their performance implicitly, instead of rating them explicitly with multidimensional metrics. Comprehensive evaluations, from automatic to human, demonstrate the advantages of GLM-Dialog compared with existing open-source Chinese dialogue models. We release both the model checkpoint and source code, and also deploy the model as a WeChat application to interact with users (1). We offer our evaluation platform online (2) in an effort to promote the development of open-source models and reliable dialogue evaluation systems. All the source code is available on GitHub (3).
Pages: 5564-5575
Number of pages: 12
Related papers
34 in total
  • [1] KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
    Chen, Wenhu
    Su, Yu
    Yan, Xifeng
    Wang, William Yang
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 8635-8648
  • [2] Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation
    Li, Yu
    Peng, Baolin
    Shen, Yelong
    Mao, Yi
    Liden, Lars
    Yu, Zhou
    Gao, Jianfeng
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022: 206-218
  • [3] Knowledge-Grounded Dialogue Generation with Pre-trained Language Models
    Zhao, Xueliang
    Wu, Wei
    Xu, Can
    Tao, Chongyang
    Zhao, Dongyan
    Yan, Rui
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 3377-3390
  • [4] A Pre-training Strategy for Zero-Resource Response Selection in Knowledge-Grounded Conversations
    Tao, Chongyang
    Chen, Changyu
    Feng, Jiazhan
    Wen, Ji-Rong
    Yan, Rui
    [J]. 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021: 4446-4457
  • [5] Approximation of Response Knowledge Retrieval in Knowledge-grounded Dialogue Generation
    Zheng, Wen
    Milic-Frayling, Natasa
    Zhou, Ke
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020
  • [6] Knowledge-Grounded Dialogue Generation for Medical Conversations: A Survey
    Liu, Xiaoxiao
    Chang, Jian
    Zhang, Jian Jun
    [J]. 2023 27TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION, IV, 2023: 409-413
  • [7] Zero-Resource Knowledge-Grounded Dialogue Generation
    Li, Linxiao
    Xu, Can
    Wu, Wei
    Zhao, Yufan
    Zhao, Xueliang
    Tao, Chongyang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [8] KINet: Incorporating Relevant Facts Into Knowledge-Grounded Dialog Generation
    Bai, Jiaqi
    Yang, Ze
    Yang, Jian
    Guo, Hongcheng
    Li, Zhoujun
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31: 1213-1222
  • [9] Adaptive Posterior Knowledge Selection for Improving Knowledge-Grounded Dialogue Generation
    Wang, Weichao
    Gao, Wei
    Feng, Shi
    Chen, Ling
    Wang, Daling
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021: 1989-1998
  • [10] TopicKS: Topic-driven Knowledge Selection for Knowledge-grounded Dialogue Generation
    Wang, Shiquan
    Si, Yuke
    Wei, Xiao
    Wang, Longbiao
    Zhuang, Zhiqiang
    Zhang, Xiaowang
    Dang, Jianwu
    [J]. INTERSPEECH 2022, 2022: 1121-1125