Enhancing Multi-modal Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation

Cited by: 1
Authors
Yang, Qian [1]
Chen, Qian
Wang, Wen
Hu, Baotian [1]
Zhang, Min [1]
Affiliations
[1] Harbin Institute of Technology, Shenzhen, People's Republic of China
Keywords
Question Answering; Cross-modal Reasoning; Multi-modal Retrieval
DOI
10.1145/3581783.3611964
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-modal multi-hop question answering involves answering a question by reasoning over multiple input sources from different modalities. Existing methods often retrieve evidence separately and then use a language model to generate an answer from the retrieved evidence; as a result, they do not adequately connect candidates and cannot model the interdependent relations during retrieval. Moreover, pipelined retrieval-then-generation approaches can yield poor generation performance when retrieval performance is low. To address these issues, we propose a Structured Knowledge and Unified Retrieval-Generation (SKURG) approach. SKURG employs an Entity-centered Fusion Encoder to align sources from different modalities using shared entities. It then uses a unified Retrieval-Generation Decoder to integrate intermediate retrieval results for answer generation and to adaptively determine the number of retrieval steps. Extensive experiments on two representative multi-modal multi-hop QA datasets, MultimodalQA and WebQA, demonstrate that SKURG outperforms the state-of-the-art models in both source retrieval and answer generation performance with fewer parameters.
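The idea of linking sources through shared entities and letting retrieval decide when to stop can be illustrated with a toy sketch. This is not SKURG's model, which relies on a learned encoder and decoder; it is a greedy entity-overlap heuristic under stated assumptions. The names `Source` and `retrieve_adaptively`, the overlap score, the zero-overlap stop rule, and the example data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    modality: str        # illustrative labels: "text", "image", "table"
    entities: frozenset  # entities mentioned in this source


def retrieve_adaptively(question_entities, sources, max_steps=5):
    """Greedy multi-hop retrieval with a stop decision (toy heuristic).

    At each step, pick the unretrieved source that shares the most entities
    with the current context (question plus everything retrieved so far).
    Stopping when no remaining source shares an entity stands in for a
    learned "end of retrieval" decision.
    """
    context = set(question_entities)
    retrieved = []
    remaining = list(sources)
    for _ in range(max_steps):
        scored = [(len(context & s.entities), s) for s in remaining]
        best_score, best = max(scored, key=lambda pair: pair[0], default=(0, None))
        if best is None or best_score == 0:
            break  # nothing left that connects to the context: stop retrieving
        retrieved.append(best)
        remaining.remove(best)
        context |= best.entities  # later hops can build on earlier ones
    return retrieved


# Hypothetical candidates from three modalities.
sources = [
    Source("passage_A", "text", frozenset({"Eiffel Tower", "Paris"})),
    Source("image_B", "image", frozenset({"Paris", "Seine"})),
    Source("table_C", "table", frozenset({"Mount Fuji", "Japan"})),
]

hops = retrieve_adaptively({"Eiffel Tower"}, sources)
print([s.name for s in hops])  # ['passage_A', 'image_B']
```

In this example the second hop is reachable only because the first retrieved source introduces a new entity, which is the kind of interdependence that separate, one-shot retrieval misses. The number of hops is not fixed in advance: the loop ends as soon as no candidate connects to the accumulated context.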
Pages: 5223-5234
Number of pages: 12