Deconfounded Visual Question Generation with Causal Inference

Cited by: 1
Authors
Chen, Jiali [1]
Guo, Zhenjun [1]
Xie, Jiayuan [1]
Cai, Yi [1]
Li, Qing [2]
Affiliations
[1] South China University of Technology, Guangzhou, People's Republic of China
[2] Hong Kong Polytechnic University, Hong Kong, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
visual question generation; causal inference; knowledge-guided
DOI
10.1145/3581783.3612536
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Visual Question Generation (VQG) task aims to generate meaningful and logically reasonable questions about a given image that target a given answer. Existing methods mainly focus on the visual concepts present in the image and have shown remarkable performance in VQG. However, these models frequently learn highly co-occurring object relationships and attributes, which introduces an inherent bias into question generation. This previously overlooked bias causes models to over-exploit spurious correlations among the visual features, the target answer, and the question. As a result, they may generate inappropriate questions that contradict the visual content or facts. In this paper, we first introduce a causal perspective on VQG and adopt a causal graph to analyze the spurious correlations among variables. Building on this analysis, we propose a Knowledge Enhanced Causal Visual Question Generation (KECVQG) model to mitigate the impact of spurious correlations on question generation. Specifically, KECVQG introduces an interventional visual feature extractor (IVE), which aims to obtain unbiased visual features through disentanglement. A knowledge-guided representation extractor (KRE) then aligns the unbiased features with external knowledge. Finally, the output features of KRE are fed into a standard transformer decoder to generate questions. Extensive experiments on the VQA v2.0 and OKVQA datasets show that KECVQG significantly outperforms existing models.
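To make the idea of a spurious correlation concrete, the toy calculation below compares an observational probability with an interventional one computed by backdoor adjustment. It is only an illustrative sketch and not the KECVQG method: the abstract does not say which adjustment the IVE module uses, and the variables `z` (a hidden co-occurrence bias), `x` (a visual feature), `y` (a question word), and all probability values are hypothetical.

```python
# Toy discrete model (all numbers are made up for illustration).
# z: hidden confounder, x: observed feature, y: outcome.
# By construction, y depends only on z, so x has NO causal effect on y.
P_Z = {0: 0.5, 1: 0.5}                      # P(z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(x=1 | z)
P_Y1_GIVEN_XZ = {                           # P(y=1 | x, z)
    (0, 0): 0.1, (1, 0): 0.1,
    (0, 1): 0.9, (1, 1): 0.9,
}


def p_x_given_z(x, z):
    """Return P(x | z) for binary x."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1


def observational(x):
    """P(y=1 | x): weights each z by the posterior P(z | x)."""
    joint = {z: P_Z[z] * p_x_given_z(x, z) for z in P_Z}
    evidence = sum(joint.values())
    return sum(P_Y1_GIVEN_XZ[(x, z)] * joint[z] / evidence for z in P_Z)


def interventional(x):
    """P(y=1 | do(x)) by backdoor adjustment: weights each z by the prior P(z)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)


obs_gap = observational(1) - observational(0)
do_gap = interventional(1) - interventional(0)
print(f"observational gap:  {obs_gap:.2f}")
print(f"interventional gap: {do_gap:.2f}")
```

In this toy setting the observational gap is large even though `x` never influences `y`, while the adjusted gap vanishes. That is the kind of co-occurrence shortcut a deconfounded model is meant to avoid.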
Pages: 5132-5142
Number of pages: 11