Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing

Authors
Bajaj, Goonmeet [1 ]
Bandyopadhyay, Bortik [1 ]
Schmidt, Daniel [2 ]
Maneriker, Pranav [1 ]
Myers, Christopher [3 ]
Parthasarathy, Srinivasan [1 ]
Affiliations
[1] Ohio State Univ OSU, Columbus, OH 43210 USA
[2] Wright State Univ, Dayton, OH 45435 USA
[3] Air Force Res Lab AFRL, Wright Patterson AFB, OH USA
DOI
10.1109/CVPRW50498.2020.00201
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditional Visual Question Answering (VQA) datasets typically contain questions related to the spatial information of objects, object attributes, or general scene questions. Recently, researchers have recognized the need to improve the balance of such datasets to reduce the system's dependency on memorized linguistic features and statistical biases, while aiming for enhanced visual understanding. However, it is unclear whether any latent patterns exist to quantify and explain these failures. As an initial step towards better quantifying our understanding of the performance of VQA models, we use a taxonomy of Knowledge Gaps (KGs) to tag questions with one or more types of KGs. Each KG describes the reasoning abilities needed to arrive at a resolution, and failure to resolve gaps indicates an absence of the required reasoning ability. After identifying KGs for each question, we examine the skew in the distribution of questions for each KG. We then introduce a targeted question generation model to reduce this skew, which allows us to generate new types of questions for an image.
Pages: 1563-1566
Number of pages: 4
Related papers
50 in total
  • [1] Knowledge enhancement and scene understanding for knowledge-based visual question answering
    Su, Zhenqiang
    Gou, Gang
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (03) : 2193 - 2208
  • [2] Knowledge enhancement and scene understanding for knowledge-based visual question answering
    Zhenqiang Su
    Gang Gou
    [J]. Knowledge and Information Systems, 2024, 66 : 2193 - 2208
  • [3] Logical Implications for Visual Question Answering Consistency
    Tascon-Morales, Sergio
    Marquez-Neila, Pablo
    Sznitman, Raphael
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 6725 - 6735
  • [4] Learning Visual Knowledge Memory Networks for Visual Question Answering
    Su, Zhou
    Zhu, Chen
    Dong, Yinpeng
    Cai, Dongqi
    Chen, Yurong
    Li, Jianguo
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7736 - 7745
  • [5] Intent Identification for Knowledge Base Question Answering
    Dai, Feifei
    Feng, Chong
    Wang, Zhiqiang
    Pei, Yuxia
    Huang, Heyan
    [J]. 2017 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2017, : 96 - 99
  • [6] KVQA: Knowledge-Aware Visual Question Answering
    Shah, Sanket
    Mishra, Anand
    Yadati, Naganand
    Talukdar, Partha Pratim
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 8876 - 8884
  • [7] Learning to Specialize with Knowledge Distillation for Visual Question Answering
    Mun, Jonghwan
    Lee, Kimin
    Shin, Jinwoo
    Han, Bohyung
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [8] Increasing Interpretability in Outside Knowledge Visual Question Answering
    Upravitelev, Max
    Krauss, Christopher
    Kuhlmann, Isabelle
    [J]. KNOWLEDGE MANAGEMENT IN ORGANISATIONS, KMO 2024, 2024, 2152 : 319 - 330
  • [9] Multimodal Knowledge Reasoning for Enhanced Visual Question Answering
    Hussain, Afzaal
    Maqsood, Ifrah
    Shahzad, Muhammad
    Fraz, Muhammad Moazam
    [J]. 2022 16TH INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY & INTERNET-BASED SYSTEMS, SITIS, 2022, : 224 - 230
  • [10] INTERPRETABLE VISUAL QUESTION ANSWERING REFERRING TO OUTSIDE KNOWLEDGE
    Zhu, He
    Togo, Ren
    Ogawa, Takahiro
    Haseyama, Miki
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2140 - 2144