SEQ2SEQ++: A Multitasking-Based Seq2seq Model to Generate Meaningful and Relevant Answers

Cited: 4
|
Authors
Palasundram, Kulothunkan [1 ]
Sharef, Nurfadhlina Mohd [1 ]
Kasmiran, Khairul Azhar [1 ]
Azman, Azreen [1 ]
Affiliations
[1] Univ Putra Malaysia, Fac Comp Sci & Informat Technol, Intelligent Comp Res Grp, Seri Kembangan 43400, Selangor, Malaysia
Source
IEEE ACCESS | 2021 / Vol. 9 / Issue 09
Keywords
Task analysis; Chatbots; Computational modeling; Decoding; Training; Transformers; Benchmark testing; Sequence to sequence learning; natural answer generation; multitask learning; attention mechanism; ATTENTION; ENCODER;
DOI
10.1109/ACCESS.2021.3133495
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
Question-answering chatbots have tremendous potential to complement humans in various fields. They are implemented using either rule-based or machine learning-based systems. Unlike the former, machine learning-based chatbots are more scalable. Sequence-to-sequence (Seq2Seq) learning is one of the most popular approaches in machine learning-based chatbots and has shown remarkable progress since its introduction in 2014. However, chatbots based on Seq2Seq learning have a weakness: they tend to generate answers that are generic or inconsistent with the questions, rendering the answers meaningless and potentially lowering the chatbot adoption rate. This weakness can be attributed to three issues: question encoder overfit, answer generation overfit, and language model influence. Several recent methods utilize multitask learning (MTL) to address this weakness. However, the existing MTL models show very little improvement over single-task learning and still generate generic and inconsistent answers. This paper presents a novel approach to MTL for the Seq2Seq learning model called SEQ2SEQ++, which comprises a multifunctional encoder, an answer decoder, an answer encoder, and a ternary classifier. Additionally, SEQ2SEQ++ utilizes a dynamic task loss weighting mechanism for MTL loss calculation and a novel attention mechanism called the comprehensive attention mechanism. Experiments on the NarrativeQA and SQuAD datasets were conducted to gauge the performance of the proposed model against two recently proposed models. The experimental results show that SEQ2SEQ++ yields noteworthy improvements over both models on the bilingual evaluation understudy (BLEU), word error rate, and Distinct-2 metrics.
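The abstract mentions a dynamic task loss weighting mechanism for combining the MTL losses. The paper's exact formulation is not given in this record; the sketch below illustrates one common generic scheme (a softmax over per-task loss ratios, as in Dynamic Weight Averaging), with all function names and the `temperature` parameter being assumptions for illustration only.

```python
import math

def dynamic_task_weights(prev_losses, curr_losses, temperature=2.0):
    """Illustrative dynamic MTL weighting: tasks whose loss is
    decreasing slowly (ratio near or above 1) receive larger weights.
    Weights are normalized so they sum to the number of tasks."""
    ratios = [c / p for c, p in zip(curr_losses, prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    n = len(ratios)
    return [n * e / total for e in exps]

def combined_mtl_loss(weights, losses):
    """Total multitask loss as a weighted sum of per-task losses."""
    return sum(w * l for w, l in zip(weights, losses))
```

For example, with three tasks whose losses fell from 1.0 to 0.9, 0.5, and 0.7 respectively, the slowest-improving task (ratio 0.9) receives the largest weight on the next step.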
Pages: 164949-164975
Page count: 27
Related Papers
50 records in total
  • [1] Sketch-aae: A Seq2Seq Model to Generate Sketch Drawings
    Lu, Jia
    Li, Xueming
    Zhang, Xianlin
    ICVISP 2019: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON VISION, IMAGE AND SIGNAL PROCESSING, 2019,
  • [2] A Chinese text corrector based on seq2seq model
    Gu, Sunyan
    Lang, Fei
    2017 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY (CYBERC), 2017, : 322 - 325
  • [3] SparQL Query Prediction Based on Seq2Seq Model
    Yang D.-H.
    Zou K.-F.
    Wang H.-Z.
    Wang J.-B.
    Ruan Jian Xue Bao/Journal of Software, 2021, 32 (03): : 805 - 817
  • [4] Keyphrase Generation Based on Deep Seq2seq Model
    Zhang, Yong
    Xiao, Weidong
    IEEE ACCESS, 2018, 6 : 46047 - 46057
  • [5] WiFi Based Fingerprinting Positioning Based on Seq2seq Model
    Sun, Haotai
    Zhu, Xiaodong
    Liu, Yuanning
    Liu, Wentao
    SENSORS, 2020, 20 (13) : 1 - 19
  • [6] Neural Question Generation based on Seq2Seq
    Liu, Bingran
    2020 5TH INTERNATIONAL CONFERENCE ON MATHEMATICS AND ARTIFICIAL INTELLIGENCE (ICMAI 2020), 2020, : 119 - 123
  • [7] Residual Seq2Seq model for Building energy management
    Kim, Marie
    Kim, Nae-soo
    Song, YuJin
    Pyo, Cheol Sig
    2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1126 - 1128
  • [8] Automatic Generation of Pseudocode with Attention Seq2seq Model
    Xu, Shaofeng
    Xiong, Yun
    2018 25TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2018), 2018, : 711 - 712
  • [9] Sparsing and Smoothing for the seq2seq Models
    Zhao S.
    Liang Z.
    Wen J.
    Chen J.
    IEEE Transactions on Artificial Intelligence, 2023, 4 (03): : 464 - 472
  • [10] Guesswork for Inference in Machine Translation with Seq2seq Model
    Liu, Lilian
    Malak, Derya
    Medard, Muriel
    2019 IEEE INFORMATION THEORY WORKSHOP (ITW), 2019, : 60 - 64