Question Type-Aware Debiasing for Test-Time Visual Question Answering Model Adaptation

Cited by: 0
Authors
Liu, Jin [1 ]
Xie, Jialong [1 ]
Zhou, Fengyu [1 ]
He, Shengfeng [2 ]
Affiliations
[1] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Peoples R China
[2] Singapore Management Univ, Sch Comp & Informat Syst, Singapore 178902, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Test-time adaptation; visual question answering; language debiasing;
DOI
10.1109/TCSVT.2024.3410041
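Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code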
0808 ; 0809 ;
Abstract
In Visual Question Answering (VQA), addressing language prior bias, where models excessively rely on superficial correlations between questions and answers, is crucial. This issue becomes more pronounced in real-world applications with diverse domains and varied question-answer distributions during testing. To tackle this challenge, Test-time Adaptation (TTA) has emerged, allowing pre-trained VQA models to adapt using unlabeled test samples. Current state-of-the-art models select reliable test samples based on fixed entropy thresholds and employ self-supervised debiasing techniques. However, these methods struggle with diverse answer spaces linked to different question types and may fail to identify biased samples that still leverage relevant visual context. In this paper, we propose Question type-guided Entropy Minimization and Debiasing (QED) as a solution for test-time VQA model adaptation. Our approach involves adaptive entropy minimization based on question types to improve the identification of fine-grained and unreliable samples. Additionally, we generate negative samples for each test sample and label them as biased if their answer entropy change rate significantly differs from positive test samples, subsequently removing them. We evaluate our approach on two public benchmarks, VQA-CP v2 and VQA-CP v1, and achieve new state-of-the-art results, with overall accuracy rates of 48.13% and 46.18%, respectively.
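The contrast the abstract draws between a fixed entropy threshold and a question type-dependent one can be illustrated with a minimal sketch. This is not the authors' QED implementation: the function names (`entropy`, `select_reliable`), the rule of scaling the threshold by the log of each question type's answer-space size, the `ratio` value, and the toy probabilities are all assumptions made for illustration only.

```python
import numpy as np

def entropy(probs):
    """Shannon entropy (in nats) of each row of a probability matrix."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)

def select_reliable(probs, qtypes, answer_space_size, ratio=0.4):
    """Mark a sample as reliable if its entropy is below ratio * log(K_t).

    K_t is the (assumed known) number of candidate answers for the
    sample's question type t. This scaling rule is a hypothetical
    stand-in for a question type-aware threshold, not the paper's rule.
    """
    ent = entropy(probs)
    thresholds = np.array(
        [ratio * np.log(answer_space_size[t]) for t in qtypes]
    )
    return ent < thresholds

# Toy predictions over a 4-answer vocabulary (hypothetical values).
probs = np.array([
    [0.95, 0.05, 0.00, 0.00],  # confident yes/no prediction
    [0.50, 0.50, 0.00, 0.00],  # maximally uncertain yes/no prediction
    [0.88, 0.04, 0.04, 0.04],  # confident prediction over 4 answers
])
qtypes = ["yes/no", "yes/no", "other"]
sizes = {"yes/no": 2, "other": 4}

adaptive_mask = select_reliable(probs, qtypes, sizes)

# A single fixed threshold tuned for yes/no questions, for comparison.
fixed_mask = entropy(probs) < 0.4 * np.log(2)
```

In this toy setup, the third sample is kept by the type-aware threshold but rejected by the fixed one, because a larger answer space naturally yields higher entropy even for a confident prediction.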
Pages: 10805-10816
Number of pages: 12