Coherent Dialogue with Attention-Based Language Models

Cited by: 0
Authors
Mei, Hongyuan [1 ]
Bansal, Mohit [2 ]
Walter, Matthew R. [3 ]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Univ N Carolina, Chapel Hill, NC USA
[3] TTI Chicago, Chicago, IL USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We model coherent conversation continuation via RNN-based dialogue models equipped with a dynamic attention mechanism. Our attention-RNN language model dynamically increases the scope of attention on the history as the conversation continues, as opposed to standard attention (or alignment) models with a fixed input scope in a sequence-to-sequence model. This allows each generated word to be associated with the most relevant words in its corresponding conversation history. We evaluate the model on two popular dialogue datasets, the open-domain MovieTriples dataset and the closed-domain Ubuntu Troubleshoot dataset, and achieve significant improvements over the state-of-the-art and baselines on several metrics, including complementary diversity-based metrics, human evaluation, and qualitative visualizations. We also show that a vanilla RNN with dynamic attention outperforms more complex memory models (e.g., LSTM and GRU) by allowing for flexible, long-distance memory. We promote further coherence via topic modeling-based reranking.
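For readers unfamiliar with the mechanism the abstract describes, below is a minimal sketch of an RNN language model whose attention scope grows with the conversation history. This is an illustration under stated assumptions, not the authors' implementation: the class name DynamicAttentionRNN, the GRU cell, the bilinear scoring layer, and all sizes are hypothetical choices made for the sketch.

```python
# Sketch of dynamic attention over a growing conversation history.
# Hypothetical names/hyperparameters; not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicAttentionRNN(nn.Module):
    def __init__(self, vocab_size, embed_size=64, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.GRUCell(embed_size, hidden_size)
        # Bilinear attention score between the current state and each past state.
        self.score = nn.Linear(hidden_size, hidden_size, bias=False)
        self.out = nn.Linear(2 * hidden_size, vocab_size)

    def forward(self, tokens):
        # tokens: LongTensor of shape (seq_len,) holding one flattened conversation.
        h = torch.zeros(1, self.rnn.hidden_size)         # (batch=1, hidden)
        history, logits = [], []
        for t in tokens:
            h = self.rnn(self.embed(t).unsqueeze(0), h)  # one RNN step
            if history:
                # Dynamic scope: attend over *all* hidden states seen so far,
                # so the attention window grows as the conversation continues,
                # unlike a fixed-scope seq2seq aligner.
                H = torch.cat(history, dim=0)            # (t, hidden)
                scores = H @ self.score(h).squeeze(0)    # (t,)
                weights = F.softmax(scores, dim=0)
                context = weights @ H                    # (hidden,)
            else:
                context = torch.zeros(self.rnn.hidden_size)
            # Predict the next word from the current state plus history context.
            logits.append(self.out(torch.cat([h.squeeze(0), context])))
            history.append(h)
        return torch.stack(logits)                       # (seq_len, vocab)

# Toy usage: score next-word distributions over a 12-token history.
model = DynamicAttentionRNN(vocab_size=1000)
dialogue = torch.randint(0, 1000, (12,))
next_word_logits = model(dialogue)                       # shape (12, 1000)
```

The key design point the sketch tries to convey is that `history` is never truncated: each generated word can attend to any earlier word in the conversation, which is what lets even a simple recurrent cell retain flexible, long-distance memory.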
Pages: 3252-3258 (7 pages)
Related Papers (50 total)
  • [1] Chen, Chin-Hui; Fu, Yi-Fu; Cheng, Hsiao-Hua; Lin, Shou-De. Unseen Filler Generalization in Attention-based Natural Language Reasoning Models. 2020 IEEE Second International Conference on Cognitive Machine Intelligence (CogMI 2020), 2020: 42-51.
  • [2] Liu, Shusen; Li, Tao; Li, Zhimin; Srikumar, Vivek; Pascucci, Valerio; Bremer, Peer-Timo. Visual Interrogation of Attention-Based Models for Natural Language Inference and Machine Comprehension. Conference on Empirical Methods in Natural Language Processing (EMNLP 2018): Proceedings of System Demonstrations, 2018: 36-41.
  • [3] Yu, Xiang; Vu, Ngoc Thang; Kuhn, Jonas. Learning the Dyck Language with Attention-based Seq2Seq Models. BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP at ACL 2019, 2019: 138-146.
  • [4] Chorowski, Jan; Bahdanau, Dzmitry; Serdyuk, Dmitriy; Cho, Kyunghyun; Bengio, Yoshua. Attention-Based Models for Speech Recognition. Advances in Neural Information Processing Systems 28 (NIPS 2015), 2015.
  • [5] Zhou, Tao; Chen, Muhao; Yu, Jie; Terzopoulos, Demetri. Attention-based Natural Language Person Retrieval. 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017: 27-34.
  • [6] Zou, Congrui; Yin, Yunfei; Huang, Faliang. Attention-based Emotion-assisted Sentiment Forecasting in Dialogue. 2022 International Joint Conference on Neural Networks (IJCNN), 2022.
  • [7] Ghaeini, Reza; Fern, Xiaoli Z.; Tadepalli, Prasad. Interpreting Recurrent and Attention-Based Neural Models: A Case Study on Natural Language Inference. 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), 2018: 4952-4957.
  • [8] Osman, Asmaa A. E.; Shalaby, Mohamed A. Wahby; Soliman, Mona M.; Elsayed, Khaled M. A Survey on Attention-Based Models for Image Captioning. International Journal of Advanced Computer Science and Applications, 2023, 14(02): 403-412.
  • [9] Qin, Chu-Xiong; Qu, Dan. Towards Understanding Attention-Based Speech Recognition Models. IEEE Access, 2020, 8: 24358-24369.
  • [10] Zhang, Jiangning; Li, Xiangtai; Li, Jian; Liu, Liang; Xue, Zhucun; Zhang, Boshen; Jiang, Zhengkai; Huang, Tianxin; Wang, Yabiao; Wang, Chengjie. Rethinking Mobile Block for Efficient Attention-based Models. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023: 1389-1400.