EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training

Cited by: 6
Authors
Gu, Yuxian [1 ,2 ]
Wen, Jiaxin [1 ,2 ]
Sun, Hao [1 ,2 ]
Song, Yi [1 ,2 ]
Ke, Pei [1 ,2 ]
Zheng, Chujie [1 ,2 ]
Zhang, Zheng [1 ,2 ]
Yao, Jianzhu [2 ]
Liu, Lei [3 ]
Zhu, Xiaoyan [1 ,2 ]
Huang, Minlie [1 ,2 ]
Affiliations
[1] Tsinghua Univ, Conversat AI Grp, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[3] York Univ, Dept Elect Engn & Comp Sci, Toronto, ON M3J 1P3, Canada
Funding
U.S. National Science Foundation;
Keywords
Natural language processing; deep learning (DL); large-scale pre-training; dialogue systems; Chinese open-domain conversational model;
DOI
10.1007/s11633-022-1387-3
CLC number
TP [Automation Technology, Computer Technology];
Discipline code
0812;
Abstract
Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous work mainly focuses on presenting and evaluating the conversational performance of released dialogue models, leaving key factors behind a powerful human-like chatbot under-discussed, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture design, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and we will make our models and code publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases, and we pose future research directions for large-scale Chinese open-domain dialogue systems.
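(The record itself contains no code; the sketch below is purely illustrative and is not the authors' released implementation. Since the abstract names decoding strategies as one of the investigated factors, it shows nucleus (top-p) sampling, a common decoding strategy for open-domain dialogue models, in plain PyTorch. The function name top_p_sample and the toy vocabulary size are assumptions for illustration.)

import torch

def top_p_sample(logits: torch.Tensor, p: float = 0.9, temperature: float = 1.0) -> int:
    # Convert logits to probabilities at the given temperature.
    probs = torch.softmax(logits / temperature, dim=-1)
    # Sort tokens by probability, highest first.
    sorted_probs, sorted_ids = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Zero out tokens outside the smallest set whose cumulative
    # probability exceeds p; the top-1 token is always kept.
    nucleus_mask = cumulative - sorted_probs > p
    sorted_probs[nucleus_mask] = 0.0
    sorted_probs /= sorted_probs.sum()
    # Sample one token id from the renormalized nucleus.
    choice = torch.multinomial(sorted_probs, num_samples=1)
    return sorted_ids[choice].item()

# Toy usage with a random distribution over a 30 000-token vocabulary.
next_token_id = top_p_sample(torch.randn(30_000), p=0.9)

Smaller p restricts sampling to fewer, higher-probability tokens (safer but blander replies), while larger p admits more of the tail (more diverse but riskier replies), which is the trade-off such decoding studies typically examine.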
Pages: 207-219 (13 pages)