A multi-scale contextual attention network for remote sensing visual question answering

Cited by: 1
Authors
Feng, Jiangfan [1]
Wang, Hui [1]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Sch Comp Sci & Technol, Chongqing 400065, Peoples R China
Keywords
Remote sensing; Visual question answering (VQA); Cross-modal; Attention; Multi-scales
DOI
10.1016/j.jag.2023.103641
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Remote sensing visual question answering (RSVQA) is a user-friendly method for analyzing remote sensing images (RSIs) in various tasks. However, current methods often overlook geospatial objects, which possess a multi-scale representation and require contextual information. Furthermore, limited research has been conducted on modeling and reasoning about the long-distance dependencies between entities, resulting in one-sided and inaccurate answer predictions. To overcome these limitations, we propose the Scale-Aware Multi-level Feature Pyramid Network (SAMFPN), which integrates contextual and multi-scale information using a Feature Pyramid Network (FPN) and Co-Attention mechanisms. The SAMFPN module incorporates a multilevel FPN to capture both global and local contextual information. Additionally, it introduces a Visual-Question Collaboration Fusion (VQCF) module that simultaneously embeds and learns visual and textual information. Our experimental results demonstrate the superior accuracy and robustness of our proposed model compared to existing models. These outcomes indicate that SAMFPN effectively captures multi-scale contextual information, making it a reliable solution for RSVQA tasks.
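
The abstract names two ingredients, a feature pyramid and a co-attention step between image and question. The sketch below is only a generic illustration of those two ideas in NumPy, not the authors' SAMFPN or VQCF: every function name is hypothetical, the pyramid uses plain average pooling and nearest-neighbor upsampling instead of learned convolutions, and the fusion is a simple element-wise product. It assumes the input height and width are divisible by 2^(levels-1).

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def avg_pool2x(fmap):
    """Halve an (H, W, C) map with 2x2 average pooling (H and W must be even)."""
    h, w, c = fmap.shape
    return fmap.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample2x(fmap):
    """Double an (H, W, C) map by nearest-neighbor repetition."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(fmap, levels=3):
    """Bottom-up pooling, then an FPN-style top-down merge (no learned weights)."""
    bottom_up = [fmap]
    for _ in range(levels - 1):
        bottom_up.append(avg_pool2x(bottom_up[-1]))
    merged = [bottom_up[-1]]  # start from the coarsest level
    for lateral in reversed(bottom_up[:-1]):
        # Add upsampled coarse context to the finer lateral map.
        merged.insert(0, lateral + upsample2x(merged[0]))
    return merged  # ordered fine -> coarse

def flatten_levels(pyramid):
    """Stack every spatial cell of every level into one (N, C) region matrix."""
    return np.concatenate([lvl.reshape(-1, lvl.shape[-1]) for lvl in pyramid])

def co_attention(visual, question):
    """visual: (N, d) regions, question: (T, d) word vectors -> fused (d,) vector."""
    d = visual.shape[1]
    affinity = visual @ question.T / np.sqrt(d)   # (N, T) region-word scores
    v_weights = softmax(affinity.max(axis=1))     # attention over regions
    q_weights = softmax(affinity.max(axis=0))     # attention over words
    fused = (v_weights @ visual) * (q_weights @ question)
    return fused, v_weights, q_weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_features = rng.normal(size=(8, 8, 16))  # stand-in for CNN features
    question_features = rng.normal(size=(5, 16))  # stand-in for word embeddings
    regions = flatten_levels(build_pyramid(image_features))
    fused, _, _ = co_attention(regions, question_features)
    print(regions.shape, fused.shape)
```

In this toy setup the question attends over regions drawn from all pyramid levels at once, so fine and coarse cells compete for the same attention mass; a trained model would replace the random inputs and the fixed pooling with learned encoders and projections.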
Pages: 12