HRVQA: A Visual Question Answering benchmark for high-resolution aerial images

Cited by: 0

Authors:

Li, Kun [1]

Vosselman, George [1]

Yang, Michael Ying [2]

Affiliations:

[1] Univ Twente, Fac Geoinformat Sci & Earth Observat ITC, Enschede, Netherlands

[2] Univ Bath, Dept Comp Sci, Visual Comp Grp, Bath, England

Source:

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING | 2024 / Vol. 214

Keywords:

Visual question answering; High-resolution aerial images; Transformers; Benchmark dataset; LANGUAGE;

DOI:

10.1016/j.isprsjprs.2024.06.002

Chinese Library Classification number:

P9 [Physical Geography];

Discipline classification code:

0705 ; 070501 ;

Abstract:

Visual question answering (VQA) is an important and challenging multimodal task in computer vision and photogrammetry. Recently, efforts have been made to bring the VQA task to aerial images, due to its potential real-world applications in disaster monitoring, urban planning, and digital earth product generation. However, the development of VQA in this domain is restricted by the huge variation in the appearance, scale, and orientation of the concepts in aerial images, along with the scarcity of well-annotated datasets. In this paper, we introduce a new dataset, HRVQA, which provides a collection of 53,512 aerial images of 1024 × 1024 pixels and semi-automatically generated 1,070,240 QA pairs. To benchmark the understanding capability of VQA models for aerial images, we evaluate the recent methods on the HRVQA dataset. Moreover, we propose a novel model, GFTransformer, with gated attention modules and a mutual fusion module. The experiments show that the proposed dataset is quite challenging, especially the specific attribute-related questions. Our method achieves superior performance in comparison to the previous state-of-the-art approaches. The dataset and the source code are released at https://hrvqa.nl/.


Pages: 65-81

Page count: 17
