Dual Swin-transformer based mutual interactive network for RGB-D salient object detection

Cited by: 9
Authors
Zeng, Chao [1, 4]
Kwong, Sam [2, 3]
Ip, Horace [2]
Affiliations
[1] Hubei Univ, Sch Artificial Intelligence, Wuhan, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[3] Lingnan Univ, Hong Kong, Peoples R China
[4] Hubei Univ, Key Lab Intelligent Sensing Syst & Secur, Minist Educ, Wuhan, Peoples R China
Keywords
Salient object detection; RGB-D images; Swin-transformer; Self-attention; Gated modality attention; Dense connection; Edge supervision; FUSION; ATTENTION
DOI
10.1016/j.neucom.2023.126779
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Depth information is important for RGB-D Salient Object Detection (SOD), and conventional deep models usually rely on CNN feature extractors. Long-range contextual dependencies, dense modeling in the saliency decoder, and multi-task learning assistance are usually ignored. In this work, we propose a Dual Swin-Transformer-based Mutual Interactive Network (DTMINet), aiming to learn contextualized, dense, and edge-aware features for RGB-D SOD. We adopt the Swin-Transformer as the visual backbone to extract contextualized features. A self-attention-based Cross-Modality Interaction module is proposed to strengthen the visual backbone for cross-modal interaction. In addition, a Gated Modality Attention module is designed for cross-modal fusion. The proposed Dense Saliency Decoder, enhanced with dense connections, progressively merges the multi-level encoding features at different decoding stages. Considering the depth quality issue, a Skip Convolution module is introduced to provide guidance to the RGB modality for the saliency prediction. In addition, we add edge prediction to the saliency predictor to regularize the learning process. Comprehensive experiments on five standard RGB-D SOD benchmark datasets over four evaluation metrics demonstrate the superiority of the proposed method.
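The abstract names a gated module for cross-modal fusion without giving its formulation in this record. As a rough illustration of the general idea only, the sketch below shows a generic element-wise gated fusion of an RGB feature vector and a depth feature vector. It is not the paper's Gated Modality Attention module: the function name `gated_fusion`, the weights `w_rgb` and `w_depth`, and the `bias` term are all hypothetical, chosen for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(rgb, depth, w_rgb, w_depth, bias=0.0):
    """Generic gated fusion of two equally sized feature vectors.

    Hypothetical illustration, not the module from the paper.
    For each element, a gate g in (0, 1) is computed from both
    modalities, and the output is g * rgb + (1 - g) * depth.
    """
    if not (len(rgb) == len(depth) == len(w_rgb) == len(w_depth)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for r, d, wr, wd in zip(rgb, depth, w_rgb, w_depth):
        gate = sigmoid(wr * r + wd * d + bias)
        fused.append(gate * r + (1.0 - gate) * d)
    return fused


if __name__ == "__main__":
    rgb_feat = [1.0, 3.0]
    depth_feat = [3.0, 5.0]
    # Zero weights and zero bias give a gate of 0.5 (a plain average).
    print(gated_fusion(rgb_feat, depth_feat, [0.0, 0.0], [0.0, 0.0]))
    # A large positive bias pushes the gate toward 1, favoring RGB.
    print(gated_fusion(rgb_feat, depth_feat, [0.0, 0.0], [0.0, 0.0], bias=50.0))
```

A gate of this kind lets a model lean on the RGB features where the depth map is unreliable, which is consistent with the depth quality concern raised in the abstract, although the actual mechanism used by DTMINet may differ.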
Pages: 13