Resolving Zero-Shot and Fact-Based Visual Question Answering via Enhanced Fact Retrieval

Cited by: 2
Authors
Wu, Sen [1]
Zhao, Guoshuai [1, 2]
Qian, Xueming [2, 3, 4]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[2] Shaanxi Yulan Jiuzhou Intelligent Optoelect Techno, Xian 710049, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Key Lab Intelligent Networks & Network Secur, Minist Educ, Xian 710049, Peoples R China
[4] Xi An Jiao Tong Univ, SMILES LAB, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Visualization; Task analysis; Knowledge based systems; Question answering (information retrieval); Predictive models; Knowledge graphs; Feature extraction; Visual question answering; zero-shot; knowledge graph;
DOI
10.1109/TMM.2023.3289729
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Practical applications of visual question answering (VQA) systems are challenging, and recent research has aimed at investigating this important field. Many issues related to real-world VQA applications must be considered. Although existing methods have focused on adding external knowledge and other descriptive information to assist in reasoning, they are limited by the impact of information retrieval errors on downstream tasks and the misalignment of the aggregated information. Thus, the overall performance of these models must be improved. To address these challenges, we propose a novel VQA model that utilizes a differentiated pretrained model to represent the input information and connects the input data with three external knowledge components through a common feature space. To combine the information in the three feature spaces, we propose an information aggregation strategy that employs a weighted score to aggregate the information in the relation and entity spaces in the answer prediction process. The experimental results show that our method achieves good performance in fact-based and zero-shot VQA tasks and achieves state-of-the-art performance on the ZS-F-VQA dataset.
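The weighted-score aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring function, the weighting scheme, or the embedding model, so the cosine similarity, the single weight `alpha`, and every name below (`cosine`, `aggregate_scores`, `predict_answer`) are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def aggregate_scores(query_entity, query_relation, candidates, alpha=0.5):
    """Combine entity-space and relation-space similarities with a weighted sum.

    candidates maps an answer name to a pair (entity_vec, relation_vec).
    alpha is a hypothetical weight in [0, 1]; the paper's actual weighting
    is not described in the abstract.
    """
    scores = {}
    for name, (entity_vec, relation_vec) in candidates.items():
        entity_score = cosine(query_entity, entity_vec)
        relation_score = cosine(query_relation, relation_vec)
        scores[name] = alpha * entity_score + (1.0 - alpha) * relation_score
    return scores


def predict_answer(query_entity, query_relation, candidates, alpha=0.5):
    """Return the candidate with the highest aggregated score."""
    scores = aggregate_scores(query_entity, query_relation, candidates, alpha)
    return max(scores, key=scores.get)


# Toy data: two candidate answers with 2-dimensional embeddings.
toy_candidates = {
    "cat": ([1.0, 0.0], [0.0, 1.0]),
    "dog": ([0.0, 1.0], [1.0, 0.0]),
}
toy_scores = aggregate_scores([1.0, 0.0], [1.0, 0.0], toy_candidates, alpha=0.7)
toy_answer = predict_answer([1.0, 0.0], [1.0, 0.0], toy_candidates, alpha=0.7)
```

In this toy setup, a larger `alpha` favors the candidate that matches in the entity space, and a smaller one favors the candidate that matches in the relation space.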
Pages: 1790-1800
Number of pages: 11