SwinNet: Swin Transformer Drives Edge-Aware RGB-D and RGB-T Salient Object Detection

Cited by: 163
Authors
Liu, Zhengyi [1 ]
Tan, Yacheng [1 ]
He, Qian [1 ]
Xiao, Yun [2 ]
Affiliations
[1] Anhui Univ, Sch Comp Sci & Technol, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Artificial Intelligence, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformer; salient object detection; RGB-D; RGB-T; multi-modality; NETWORK; IMAGE; MODEL;
DOI
10.1109/TCSVT.2021.3127149
Chinese Library Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Convolutional neural networks (CNNs) are good at extracting contextual features within certain receptive fields, while transformers can model global long-range dependencies. By absorbing the advantage of the transformer and the merit of the CNN, the Swin Transformer shows strong feature representation ability. Based on it, we propose a cross-modality fusion model, SwinNet, for RGB-D and RGB-T salient object detection. It is driven by the Swin Transformer to extract hierarchical features, boosted by an attention mechanism to bridge the gap between the two modalities, and guided by edge information to sharpen the contour of the salient object. To be specific, a two-stream Swin Transformer encoder first extracts multi-modality features, and then a spatial alignment and channel re-calibration module is presented to optimize intra-level cross-modality features. To clarify the fuzzy boundary, an edge-guided decoder achieves inter-level cross-modality fusion under the guidance of edge features. The proposed model outperforms state-of-the-art models on RGB-D and RGB-T datasets, showing that it provides more insight into the cross-modality complementarity task.
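The "channel re-calibration" step in the abstract resembles squeeze-and-excitation-style channel attention, where one modality's global statistics gate the other modality's channels. Below is a minimal, framework-free sketch of that idea in pure Python; the function name, data layout (a list of channels, each a flat list of spatial values), and sigmoid gating are illustrative assumptions, not the paper's actual implementation.

```python
import math

def sigmoid(x):
    # Standard logistic function, maps any real value into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def channel_recalibrate(features, guide):
    """Hypothetical SE-style cross-modality re-calibration sketch:
    each channel of `features` (e.g. RGB) is re-weighted by a gate
    derived from the channel-wise global average of `guide`
    (e.g. the depth or thermal modality)."""
    out = []
    for feat_ch, guide_ch in zip(features, guide):
        # Squeeze: global average pooling over the guide channel.
        avg = sum(guide_ch) / len(guide_ch)
        # Excite: turn the pooled statistic into a gate in (0, 1).
        gate = sigmoid(avg)
        # Re-calibrate: scale the feature channel by the gate.
        out.append([v * gate for v in feat_ch])
    return out
```

For example, a guide channel averaging to zero yields a gate of 0.5 (halving that feature channel), while a strongly activated guide channel yields a gate near 1.0 (passing the feature channel through almost unchanged); in the real model such gating would act on learned feature maps inside the encoder-decoder pipeline.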
Pages: 4486 - 4497
Number of pages: 12