Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model

Cited by: 4
Authors
Chen, Xiaolin [1]
Song, Xuemeng [2]
Jing, Liqiang [2]
Li, Shuo [2]
Hu, Linmei [3]
Nie, Liqiang [4]
Affiliations
[1] Shandong Univ, Sch Software, Joint SDU NTU Ctr Artificial Intelligence Res, Jinan, Peoples R China
[2] Shandong Univ, Sch Comp Sci & Technol, Jinan, Peoples R China
[3] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing, Peoples R China
[4] Harbin Inst Technol Shenzhen, Sch Comp Sci & Technol, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multimodal task-oriented dialog systems; text response generation; generative pretrained language model; dual knowledge selection
DOI
10.1145/3606368
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Text response generation for multimodal task-oriented dialog systems, which aims to generate the proper text response given the multimodal context, is an essential yet challenging task. Although existing efforts have achieved compelling success, they still suffer from two pivotal limitations: (1) they overlook the benefit of generative pretraining, and (2) they ignore the textual context-related knowledge. To address these limitations, we propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD), consisting of three key components: dual knowledge selection, dual knowledge-enhanced context learning, and knowledge-enhanced response generation. Specifically, the dual knowledge selection component selects the related knowledge according to both the textual and visual modalities of the given context. Thereafter, the dual knowledge-enhanced context learning component seamlessly integrates the selected knowledge into the multimodal context learning from both global and local perspectives, where the cross-modal semantic relation is also explored. Moreover, the knowledge-enhanced response generation component comprises a revised BART decoder, in which an additional dot-product knowledge-decoder attention sub-layer is introduced to explicitly use the knowledge when generating the text response. Extensive experiments on a public dataset verify the superiority of the proposed DKMD over state-of-the-art competitors.
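The dot-product knowledge-decoder attention mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `knowledge_attention`, the scaling by the square root of the dimension, the residual addition, and the absence of learned projection matrices are all assumptions made for illustration. The sketch only shows the general idea of decoder states attending over a set of selected knowledge vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def knowledge_attention(decoder_states, knowledge):
    """Hypothetical knowledge-decoder attention (illustrative only).

    decoder_states: list of query vectors, one per decoder position.
    knowledge: list of knowledge vectors, used as both keys and values
               (an assumption; a real model would apply learned projections).
    Returns (outputs, weights), where weights[i][j] is the attention that
    decoder position i pays to knowledge vector j.
    """
    dim = len(knowledge[0])
    outputs, all_weights = [], []
    for query in decoder_states:
        # Scaled dot product between the query and every knowledge vector.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in knowledge
        ]
        weights = softmax(scores)
        # Weighted sum of the knowledge vectors.
        context = [
            sum(w * vec[j] for w, vec in zip(weights, knowledge))
            for j in range(dim)
        ]
        # Residual connection (assumed), as in standard Transformer sub-layers.
        outputs.append([q + c for q, c in zip(query, context)])
        all_weights.append(weights)
    return outputs, all_weights
```

Calling `knowledge_attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])` gives the first knowledge vector the larger attention weight, because it has the larger dot product with the query.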
Pages: 25