A DIAGNOSTIC STUDY OF VISUAL QUESTION ANSWERING WITH ANALOGICAL REASONING

Cited by: 1
Authors
Huang, Ziqi [1]
Zhu, Hongyuan [2]
Sun, Ying [2]
Choi, Dongkyu [3]
Tan, Cheston [2]
Lim, Joo-Hwee [1, 2]
Affiliations
[1] Nanyang Technological University, Singapore, Singapore
[2] A*STAR, Institute for Infocomm Research (I2R), Singapore, Singapore
[3] A*STAR, Institute of High Performance Computing (IHPC), Singapore, Singapore
Keywords
analogical reasoning; visual reasoning; Visual Question Answering (VQA); synthetic dataset; benchmark
DOI
10.1109/ICIP42928.2021.9506539
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The deep learning community has made rapid progress in low-level visual perception tasks such as object localization, detection, and segmentation. However, for tasks such as Visual Question Answering (VQA) and visual language grounding, which require high-level reasoning abilities, large gaps still exist between artificial systems and human intelligence. In this work, we perform a diagnostic study of recent popular VQA in terms of analogical reasoning. We term this task Analogical VQA, in which a system must reason over a group of images to find analogical relations among them in order to correctly answer a natural language question. To study the task in depth, we propose an initial diagnostic synthetic dataset, CLEVR-Analogy, which tests a range of analogical reasoning abilities (e.g., reasoning on object attributes, spatial relationships, existence, and arithmetic analogies). We benchmark various recent state-of-the-art methods on our dataset, compare the results against human performance, and find that existing systems fall short when facing analogical reasoning involving spatial relationships. The dataset and code will be made publicly available to facilitate future research.
Pages: 2463-2467
Page count: 5