Learning consensus-aware semantic knowledge for remote sensing image captioning

Cited by: 6

Authors:

Li, Yunpeng [1]

Zhang, Xiangrong [1]

Cheng, Xina [1]

Tang, Xu [1]

Jiao, Licheng [1]

Affiliations:

[1] Xidian Univ, Key Lab Intelligent Percept & Image Understanding, Minist Educ, Xian 710071, Shaanxi, Peoples R China

Source:

PATTERN RECOGNITION | 2024 / Vol. 145

Funding:

National Natural Science Foundation of China;

Keywords:

Cross-modal understanding; Visual-semantic interaction; Remote sensing image captioning; Graph convolutional network;

DOI:

10.1016/j.patcog.2023.109893

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Tremendous progress has been made in the remote sensing image captioning (RSIC) task in recent years, yet some problems remain unresolved: (1) bridging the gap between visual features and semantic concepts, and (2) reasoning about the higher-level relationships between semantic concepts. In this work, we focus on injecting high-level visual-semantic interaction into the RSIC model. First, the semantic concept extractor (SCE), which is end-to-end trainable, precisely captures the semantic concepts contained in remote sensing images (RSIs). In particular, the visual-semantic co-attention (VSCA) is designed to gain coarse concept-related regions and region-related concepts for multimodal interaction. Furthermore, we incorporate the two types of attentive vectors with semantic-level relational features into a consensus exploitation (CE) block for learning cross-modal consensus-aware knowledge. Experiments on three benchmark data sets show the superiority of our approach over the reference methods.


Pages: 12
