An Answer FeedBack Network for Visual Question Answering

Cited by: 0
Authors
Tian, Weidong [1]
Tian, Ruihua [1]
Zhao, Zhongqiu [1]
Ren, Quan [1]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/IJCNN54540.2023.10191079
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recent advances have explored the power of the transformer architecture in Visual Question Answering (VQA). However, most models suffer from misalignment of multimodal features and attend to unimportant image regions when answering a given question. To address this, we propose an Answer FeedBack Network (AFBN) that focuses on the image region features most useful for answering the question. The answers generated by the backbone network are fed back into the network as feedback information, and a FeedBack module (FB) controls this answer feedback. Additionally, we adopt a consistency loss function to reconstruct the image region features; this loss keeps the image region features related to the question consistent with those related to the answer. Extensive experiments on the VQA-v2 benchmark dataset show that our method achieves better performance than state-of-the-art methods.
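The abstract gives no architectural details, so the following is only a toy sketch of the general idea it describes: attend over image regions with the question, feed a predicted answer back to attend again, and penalize disagreement between the two attended features. Everything here is an assumption made for illustration, not the AFBN implementation: the dot-product attention, the additive scalar gate standing in for the FB module, the mean-squared-error form of the consistency loss, and all function names and values.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(regions, query):
    # Attention-weighted sum of region features, guided by a query vector.
    weights = softmax([dot(r, query) for r in regions])
    dim = len(regions[0])
    return [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]

def feedback_gate(question, answer, gate):
    # Hypothetical stand-in for the FB module: a scalar gate controls how
    # much of the answer embedding is mixed into the query.
    return [q + gate * a for q, a in zip(question, answer)]

def consistency_loss(feat_a, feat_b):
    # Assumed form: mean squared error between the two attended features.
    return sum((x - y) ** 2 for x, y in zip(feat_a, feat_b)) / len(feat_a)

# Toy data: three image regions with 2-d features (illustrative values).
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question = [1.0, 0.0]
answer = [0.0, 2.0]  # embedding of the answer predicted in the first pass

first_pass = attend(regions, question)
second_pass = attend(regions, feedback_gate(question, answer, gate=0.5))
loss = consistency_loss(first_pass, second_pass)
```

With the gate set to 0 the second pass reduces to the first and the loss is zero; a nonzero gate shifts attention toward answer-related regions, and the loss measures how far the attended features have moved.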
Pages: 7