Graphhopper: Multi-hop Scene Graph Reasoning for Visual Question Answering

Cited by: 12
Authors
Koner, Rajat [1]
Li, Hang [1, 2]
Hildebrandt, Marcel [1, 2]
Das, Deepan [3]
Tresp, Volker [1, 2]
Guennemann, Stephan [3]
Affiliations
[1] Ludwig Maximilian Univ Munich, Munich, Germany
[2] Siemens AG, Munich, Germany
[3] Tech Univ Munich, Munich, Germany
Source
SEMANTIC WEB - ISWC 2021 | 2021 / Vol. 12922
Keywords
Visual Question Answering (VQA); Knowledge graph reasoning; Scene graph reasoning; Multi-modal reasoning; Reinforcement learning; LANGUAGE;
DOI
10.1007/978-3-030-88361-4_7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual Question Answering (VQA) is concerned with answering free-form questions about an image. Since it requires a deep semantic and linguistic understanding of the question and the ability to associate it with various objects that are present in the image, it is an ambitious task and requires multi-modal reasoning from both computer vision and natural language processing. We propose Graphhopper, a novel method that approaches the task by integrating knowledge graph reasoning, computer vision, and natural language processing techniques. Concretely, our method is based on performing context-driven, sequential reasoning based on the scene entities and their semantic and spatial relationships. As a first step, we derive a scene graph that describes the objects in the image, as well as their attributes and their mutual relationships. Subsequently, a reinforcement learning agent is trained to autonomously navigate in a multi-hop manner over the extracted scene graph to generate reasoning paths, which are the basis for deriving answers. We conduct an experimental study on the challenging dataset GQA, based on both manually curated and automatically generated scene graphs. Our results show that we keep up with human performance on manually curated scene graphs. Moreover, we find that Graphhopper outperforms another state-of-the-art scene graph reasoning model on both manually curated and automatically generated scene graphs by a significant margin.
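The multi-hop navigation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, the `has_color` attribute edges, the `sample_path` helper, and the uniform random choice of the next edge are all assumptions made for illustration. In Graphhopper the next hop is chosen by a trained reinforcement learning agent conditioned on the question, which is not modeled here.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical toy scene graph: node -> list of (relation, neighbor) edges.
# Attributes are modeled as nodes reached via a "has_color" edge (an assumption).
SceneGraph = Dict[str, List[Tuple[str, str]]]


def sample_path(graph: SceneGraph, start: str, max_hops: int,
                rng: random.Random) -> List[str]:
    """Walk up to max_hops edges from start and return the visited path.

    The next edge is picked uniformly at random; a learned, question-aware
    policy would replace this choice in an actual agent.
    """
    path = [start]
    node = start
    for _ in range(max_hops):
        edges = graph.get(node, [])
        if not edges:  # dead end: no outgoing edges to follow
            break
        relation, node = rng.choice(edges)
        path.extend([relation, node])
    return path


toy_graph: SceneGraph = {
    "man": [("holding", "umbrella"), ("wearing", "jacket")],
    "umbrella": [("has_color", "red")],
    "jacket": [("has_color", "blue")],
}

# Two hops from "man" end on a color node, which serves as the candidate answer.
path = sample_path(toy_graph, "man", max_hops=2, rng=random.Random(0))
answer = path[-1]
print(" -> ".join(path), "| answer:", answer)
```

The returned path alternates nodes and relations, so it doubles as a readable reasoning trace for the candidate answer at its end.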
Pages: 111-127
Page count: 17