SwinNet: Swin Transformer Drives Edge-Aware RGB-D and RGB-T Salient Object Detection

Cited by: 163
Authors
Liu, Zhengyi [1 ]
Tan, Yacheng [1 ]
He, Qian [1 ]
Xiao, Yun [2 ]
Institutions
[1] Anhui Univ, Sch Comp Sci & Technol, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Artificial Intelligence, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformer; salient object detection; RGB-D; RGB-T; multi-modality; NETWORK; IMAGE; MODEL;
DOI
10.1109/TCSVT.2021.3127149
CLC Classification Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Convolutional neural networks (CNNs) are good at extracting contextual features within certain receptive fields, while transformers can model global long-range dependencies. By absorbing the advantages of the transformer and the merits of the CNN, Swin Transformer shows strong feature representation ability. Based on it, we propose a cross-modality fusion model, SwinNet, for RGB-D and RGB-T salient object detection. It is driven by Swin Transformer to extract hierarchical features, boosted by an attention mechanism to bridge the gap between the two modalities, and guided by edge information to sharpen the contour of the salient object. To be specific, a two-stream Swin Transformer encoder first extracts multi-modality features, and then a spatial alignment and channel re-calibration module is presented to optimize intra-level cross-modality features. To clarify the fuzzy boundary, an edge-guided decoder achieves inter-level cross-modality fusion under the guidance of edge features. The proposed model outperforms the state-of-the-art models on RGB-D and RGB-T datasets, showing that it provides more insight into the cross-modality complementarity task.
Pages: 4486-4497
Number of pages: 12
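
To make the pipeline described in the abstract concrete, the following PyTorch sketch mirrors its three stages: a two-stream hierarchical encoder, per-level cross-modality fusion with spatial gating plus channel re-calibration, and an edge-guided top-down decoder. Everything here is an illustrative reconstruction rather than the authors' code: the module designs (ChannelRecalibration, CrossModalFusion, EdgeGuidedDecoder, SwinNetSketch), their parameters, and the toy backbone are assumptions; in practice the two encoder streams would be pretrained Swin Transformers producing a four-level feature pyramid.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelRecalibration(nn.Module):
    # Squeeze-and-excitation style channel re-weighting; a stand-in for the
    # paper's channel re-calibration step (assumed design).
    def __init__(self, ch, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(x)  # per-channel gates in [0, 1]


class CrossModalFusion(nn.Module):
    # Intra-level fusion: each modality is spatially gated by the other
    # (spatial alignment), then the concatenation is channel-re-calibrated.
    def __init__(self, ch):
        super().__init__()
        self.gate_rgb = nn.Sequential(nn.Conv2d(ch, 1, 7, padding=3), nn.Sigmoid())
        self.gate_aux = nn.Sequential(nn.Conv2d(ch, 1, 7, padding=3), nn.Sigmoid())
        self.recal = ChannelRecalibration(2 * ch)
        self.merge = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, f_rgb, f_aux):
        g_rgb = f_rgb * self.gate_aux(f_aux)  # depth/thermal cue highlights RGB
        g_aux = f_aux * self.gate_rgb(f_rgb)  # RGB cue highlights depth/thermal
        return self.merge(self.recal(torch.cat([g_rgb, g_aux], dim=1)))


class EdgeGuidedDecoder(nn.Module):
    # Top-down decoder: an edge map predicted from the finest fused level
    # modulates every stage while features are progressively upsampled.
    def __init__(self, chans=(128, 256, 512, 1024)):
        super().__init__()
        self.edge_head = nn.Conv2d(chans[0], 1, 3, padding=1)
        self.laterals = nn.ModuleList(nn.Conv2d(c, chans[0], 1) for c in chans)
        self.sal_head = nn.Conv2d(chans[0], 1, 3, padding=1)

    def forward(self, feats):  # feats: four fused maps, fine -> coarse
        edge = torch.sigmoid(self.edge_head(feats[0]))
        x = self.laterals[-1](feats[-1])
        for lat, f in zip(reversed(self.laterals[:-1]), reversed(feats[:-1])):
            x = F.interpolate(x, size=f.shape[-2:], mode='bilinear', align_corners=False)
            e = F.interpolate(edge, size=f.shape[-2:], mode='bilinear', align_corners=False)
            x = x + lat(f) * (1 + e)  # edge cue emphasizes boundary regions
        return self.sal_head(x), edge  # saliency logits, edge map


class SwinNetSketch(nn.Module):
    # Two-stream encoder + per-level fusion + edge-guided decoding.
    # backbone_fn must build a module returning a 4-level NCHW pyramid.
    def __init__(self, backbone_fn, chans=(128, 256, 512, 1024)):
        super().__init__()
        self.enc_rgb, self.enc_aux = backbone_fn(), backbone_fn()
        self.fusions = nn.ModuleList(CrossModalFusion(c) for c in chans)
        self.decoder = EdgeGuidedDecoder(chans)

    def forward(self, rgb, aux):
        feats = [fuse(fr, fa) for fuse, fr, fa in
                 zip(self.fusions, self.enc_rgb(rgb), self.enc_aux(aux))]
        return self.decoder(feats)


if __name__ == "__main__":
    # Toy stand-in backbone so the sketch runs end to end; in practice both
    # streams would be hierarchical Swin Transformers (e.g. Swin-B).
    chans = (128, 256, 512, 1024)

    class ToyPyramid(nn.Module):
        def __init__(self):
            super().__init__()
            dims = (3,) + chans
            self.stages = nn.ModuleList(
                nn.Conv2d(dims[i], dims[i + 1], 3, stride=2, padding=1)
                for i in range(4))

        def forward(self, x):
            x, feats = F.avg_pool2d(x, 2), []  # start at 1/4 scale like Swin
            for stage in self.stages:
                x = stage(x)
                feats.append(x)
            return feats

    net = SwinNetSketch(ToyPyramid)
    sal, edge = net(torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224))
    print(sal.shape, edge.shape)  # both (1, 1, 56, 56) for 224x224 inputs

Swapping ToyPyramid for a real Swin backbone that exposes its four stage outputs in NCHW layout, and training with saliency plus edge supervision, would approximate the setup the abstract describes.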