Visual Grounding in Remote Sensing Images

Cited by: 23
Authors
Sun, Yuxi [1]
Feng, Shanshan [1]
Li, Xutao [1]
Ye, Yunming [1]
Kang, Jian [2]
Huang, Xu [1]
Affiliations
[1] Harbin Institute of Technology, Shenzhen, People's Republic of China
[2] Soochow University, Suzhou, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
dataset; object retrieval; visual grounding; remote sensing; referring expression;
DOI
10.1145/3503161.3548316
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Retrieving ground objects from large-scale remote sensing images is important for many applications. We present a novel problem: visual grounding in remote sensing images. Visual grounding aims to locate the particular objects referred to by a natural language expression in an image, in the form of a bounding box or a segmentation mask. The task already exists in the computer vision community, but existing benchmark datasets and methods mainly focus on natural images rather than remote sensing images. Compared with natural images, remote sensing images contain large-scale scenes and the geographical spatial information of ground objects (e.g., longitude and latitude). Existing methods cannot deal with these challenges. In this paper, we collect a new visual grounding dataset, called RSVG, and design a new method, namely GeoVG. The proposed method consists of a language encoder, an image encoder, and a fusion module. The language encoder learns numerical geospatial relations and represents a complex expression as a geospatial relation graph. The image encoder learns large-scale remote sensing scenes with adaptive region attention. The fusion module fuses the text and image features for visual grounding. We evaluate the proposed method by comparing it with state-of-the-art methods on RSVG. Experiments show that our method outperforms previous methods on the proposed dataset. https://sunyuxi.github.io/publication/GeoVG
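As a rough illustration of what a "geospatial relation graph" with numerical relations might look like as a data structure, the following Python sketch builds one by hand for a made-up expression. It is not the GeoVG implementation: the class names (`Relation`, `RelationGraph`), the relation label `west_of`, the `distance_m` field, and the example expression are all assumptions introduced here for illustration, and no parsing or learning is performed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Relation:
    """Directed edge: `subject` stands in relation `kind` to `anchor` (hypothetical)."""
    subject: str
    anchor: str
    kind: str                           # e.g. "west_of"; the label set is assumed
    distance_m: Optional[float] = None  # numerical part of the relation, if stated


@dataclass
class RelationGraph:
    """Hypothetical container for the objects and relations named in one expression."""
    target: str                         # the object the expression asks to locate
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_relation(self, subject, anchor, kind, distance_m=None):
        # Register both endpoints as nodes and store the directed edge.
        self.nodes.update((subject, anchor))
        self.edges.append(Relation(subject, anchor, kind, distance_m))

    def relations_of(self, subject):
        # All edges whose subject is the given object.
        return [e for e in self.edges if e.subject == subject]


def example_graph():
    # Hand-built graph for the made-up expression:
    # "the baseball field about 100 m west of the parking lot"
    graph = RelationGraph(target="baseball field")
    graph.nodes.add(graph.target)
    graph.add_relation("baseball field", "parking lot", "west_of", distance_m=100.0)
    return graph


if __name__ == "__main__":
    g = example_graph()
    for rel in g.relations_of(g.target):
        print(rel.subject, rel.kind, rel.anchor, rel.distance_m)
```

In a sketch like this, the numeric distance is kept as a separate field rather than folded into the relation label, so that a downstream model could treat "100 m" as a quantity instead of an opaque token; whether GeoVG does this is not stated in the abstract.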
Pages: 9