Salient Object Detection in Optical Remote Sensing Images Driven by Transformer

Cited by: 15
Authors
Li, Gongyang [1 ,2 ]
Bai, Zhen [1 ]
Liu, Zhi [1 ,2 ]
Zhang, Xinpeng [1 ]
Ling, Haibin [3 ]
Affiliations
[1] Shanghai Univ, Joint Int Res Lab Specialty Fiber Opt & Adv Commun, Key Lab Specialty Fiber Opt & Opt Access Networks, Sch Commun & Informat Engn,Shanghai Inst Adv Commu, Shanghai 200444, Peoples R China
[2] Wenzhou Inst Shanghai Univ, Wenzhou 325000, Peoples R China
[3] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Feature extraction; Transformers; Optical imaging; Object detection; Remote sensing; Context modeling; Semantics; Salient object detection; optical remote sensing image; transformer; directional convolution; shuffle weighted spatial attention; NETWORK;
DOI
10.1109/TIP.2023.3314285
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing methods for Salient Object Detection in Optical Remote Sensing Images (ORSI-SOD) mainly adopt Convolutional Neural Networks (CNNs) as the backbone, such as VGG and ResNet. Since CNNs can only extract features within certain receptive fields, most ORSI-SOD methods generally follow the local-to-contextual paradigm. In this paper, we propose a novel Global Extraction Local Exploration Network (GeleNet) for ORSI-SOD following the global-to-local paradigm. Specifically, GeleNet first adopts a transformer backbone to generate four-level feature embeddings with global long-range dependencies. Then, GeleNet employs a Direction-aware Shuffle Weighted Spatial Attention Module (D-SWSAM) and its simplified version (SWSAM) to enhance local interactions, and a Knowledge Transfer Module (KTM) to further enhance cross-level contextual interactions. D-SWSAM comprehensively perceives the orientation information in the lowest-level features through directional convolutions to adapt to various orientations of salient objects in ORSIs, and effectively enhances the details of salient objects with an improved attention mechanism. SWSAM discards the direction-aware part of D-SWSAM to focus on localizing salient objects in the highest-level features. KTM models the contextual correlation knowledge of two middle-level features of different scales based on the self-attention mechanism, and transfers the knowledge to the raw features to generate more discriminative features. Finally, a saliency predictor is used to generate the saliency map based on the outputs of the above three modules. Extensive experiments on three public datasets demonstrate that the proposed GeleNet outperforms relevant state-of-the-art methods. The code and results of our method are available at https://github.com/MathLee/GeleNet.
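The abstract's D-SWSAM combines directional convolutions (to capture the varied orientations of salient objects) with a spatial attention mechanism that reweights the feature map. As a rough illustration of that idea only, the toy sketch below applies four hypothetical 3x3 directional kernels (horizontal, vertical, and two diagonals, all chosen for illustration and not taken from the paper), averages the responses into a sigmoid spatial attention map, and reweights the input feature. It is not the authors' implementation; for that, see the linked GeleNet repository.

```python
import numpy as np

def conv2d(x, k):
    """'Same' 2-D cross-correlation with zero padding (single channel)."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 3x3 directional kernels: horizontal, vertical, two diagonals.
KERNELS = [
    np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=float) / 3.0,
    np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=float) / 3.0,
    np.eye(3) / 3.0,
    np.fliplr(np.eye(3)) / 3.0,
]

def directional_spatial_attention(feat):
    """Toy stand-in for the D-SWSAM idea: aggregate directional
    responses into a spatial attention map and reweight the input."""
    responses = [conv2d(feat, k) for k in KERNELS]
    attention = sigmoid(np.mean(responses, axis=0))  # values in (0, 1)
    return feat * attention

feat = np.random.default_rng(0).random((8, 8))  # a tiny single-channel "feature"
out = directional_spatial_attention(feat)
print(out.shape)  # (8, 8)
```

Because the attention map lies in (0, 1), the output is a per-pixel damping of the input; the real module additionally uses channel shuffling and learned weights, which are omitted here for brevity.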
Pages: 5257-5269
Page count: 13