Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing

Authors
Bajaj, Goonmeet [1]
Bandyopadhyay, Bortik [1]
Schmidt, Daniel [2]
Maneriker, Pranav [1]
Myers, Christopher [3]
Parthasarathy, Srinivasan [1]
Affiliations
[1] Ohio State Univ OSU, Columbus, OH 43210 USA
[2] Wright State Univ, Dayton, OH 45435 USA
[3] Air Force Res Lab AFRL, Wright Patterson AFB, OH USA
DOI
10.1109/CVPRW50498.2020.00201
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditional Visual Question Answering (VQA) datasets typically contain questions related to the spatial information of objects, object attributes, or general scene questions. Recently, researchers have recognized the need to improve the balance of such datasets to reduce the system's dependency on memorized linguistic features and statistical biases, while aiming for enhanced visual understanding. However, it is unclear whether any latent patterns exist to quantify and explain these failures. As an initial step towards better quantifying our understanding of the performance of VQA models, we use a taxonomy of Knowledge Gaps (KGs) to tag questions with one or more types of KGs. Each KG describes the reasoning abilities needed to arrive at a resolution, and failure to resolve gaps indicates an absence of the required reasoning ability. After identifying KGs for each question, we examine the skew in the distribution of questions for each KG. We then introduce a targeted question generation model to reduce this skew, which allows us to generate new types of questions for an image.
Pages: 1563-1566
Number of pages: 4