DAST: Depth-Aware Assessment and Synthesis Transformer for RGB-D Salient Object Detection

Cited by: 1
Authors
Xia, Chenxing [1 ]
Duan, Songsong [1 ]
Fang, Xianjin [1 ]
Ge, Bin [1 ]
Gao, Xiuju [1 ]
Cui, Jianhua [2 ]
Affiliations
[1] Anhui Univ Sci & Technol, Huainan, Anhui, Peoples R China
[2] China Tobacco Henan Ind Co Ltd, Anyang Cigarette Factory, Anyang, Henan, Peoples R China
Funding
US National Science Foundation; Natural Science Foundation of Anhui Province;
Keywords
Salient object detection; Swin transformer; Low-quality; Depth map; Assessment and synthesis; NETWORK;
DOI
10.1007/978-3-031-20865-2_35
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
The introduction and growing availability of depth maps have brought new vitality to salient object detection (SOD), and numerous RGB-D SOD methods have been proposed, mainly focusing on how to utilize and integrate the depth map. Although existing methods have achieved promising performance, the negative effects of low-quality depth maps have not been effectively addressed. In this paper, we tackle this problem by judging the quality of depth maps and assigning low weighting factors to low-quality ones. To this end, we propose a novel Transformer-based SOD framework, namely the Depth-aware Assessment and Synthesis Transformer (DAST), to further improve the performance of RGB-D SOD. The proposed DAST involves two primary designs: 1) a Swin Transformer-based encoder is employed instead of a convolutional neural network for more effective feature extraction and capture of long-range dependencies; 2) a Depth Assessment and Synthesis (DAS) module is proposed to judge the quality of depth maps and fuse the multi-modality salient features by computing the difference of the saliency maps from the RGB and depth streams in a coarse-to-fine manner. Extensive experiments on five benchmark datasets demonstrate that the proposed DAST achieves favorable performance compared with other state-of-the-art (SOTA) methods.
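The abstract describes the depth-assessment idea only at a high level: compare the saliency predictions of the RGB and depth streams, treat large disagreement as a sign of a low-quality depth map, and down-weight the depth features accordingly before fusion. The following minimal PyTorch sketch illustrates that idea under these assumptions; all names (e.g. DepthAssessmentFusion, quality_factor) are hypothetical and this is not the authors' DAS module.

```python
# Illustrative sketch (assumed design, not the paper's implementation):
# estimate a per-image depth-quality factor from the disagreement between
# coarse RGB and depth saliency maps, then use it to down-weight depth features.
import torch
import torch.nn as nn


class DepthAssessmentFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 heads projecting each stream's features to a coarse saliency map
        self.rgb_head = nn.Conv2d(channels, 1, kernel_size=1)
        self.depth_head = nn.Conv2d(channels, 1, kernel_size=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, f_rgb: torch.Tensor, f_depth: torch.Tensor) -> torch.Tensor:
        # Coarse saliency predictions from each modality, in (0, 1)
        s_rgb = torch.sigmoid(self.rgb_head(f_rgb))        # (B, 1, H, W)
        s_depth = torch.sigmoid(self.depth_head(f_depth))  # (B, 1, H, W)

        # Mean absolute disagreement between the two predictions;
        # a large value is treated as evidence of an unreliable depth map.
        disagreement = torch.abs(s_rgb - s_depth).mean(dim=(1, 2, 3))  # (B,)

        # Quality factor in [0, 1]: low when disagreement is high
        quality_factor = (1.0 - disagreement.clamp(0.0, 1.0)).view(-1, 1, 1, 1)

        # Down-weight depth features before fusing with RGB features
        return self.fuse(torch.cat([f_rgb, quality_factor * f_depth], dim=1))


if __name__ == "__main__":
    module = DepthAssessmentFusion(channels=64)
    f_rgb = torch.randn(2, 64, 56, 56)
    f_depth = torch.randn(2, 64, 56, 56)
    print(module(f_rgb, f_depth).shape)  # torch.Size([2, 64, 56, 56])
```

In the paper this assessment is described as coarse-to-fine over multiple decoder stages; the sketch collapses it to a single stage purely for clarity.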
Pages: 473-487
Page count: 15
Related Papers
50 records in total
  • [21] TriTransNet: RGB-D Salient Object Detection with a Triplet Transformer Embedding Network
    Liu, Zhengyi
    Wang, Yuan
    Tu, Zhengzheng
    Xiao, Yun
    Tang, Bin
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4481 - 4490
  • [22] CATNet: A Cascaded and Aggregated Transformer Network for RGB-D Salient Object Detection
    Sun, Fuming
    Ren, Peng
    Yin, Bowen
    Wang, Fasheng
    Li, Haojie
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 2249 - 2262
  • [23] HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness
    Wu, Zongwei
    Allibert, Guillaume
    Meriaudeau, Fabrice
    Ma, Chao
    Demonceaux, Cedric
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 2160 - 2173
  • [24] Depth cue enhancement and guidance network for RGB-D salient object detection
    Li, Xiang
    Zhang, Qing
    Yan, Weiqi
    Dai, Meng
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 95
  • [25] Synergizing triple attention with depth quality for RGB-D salient object detection
    Song, Peipei
    Li, Wenyu
    Zhong, Peiyan
    Zhang, Jing
    Koniusz, Piotr
    Duan, Feng
    Barnes, Nick
    [J]. NEUROCOMPUTING, 2024, 589
  • [26] MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer
    Huang, Kuan-Chih
    Wu, Tsung-Han
    Su, Hung-Ting
    Hsu, Winston H.
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4002 - 4011
  • [27] PATNet: Patch-to-pixel attention-aware transformer network for RGB-D and RGB-T salient object detection
    Jiang, Mingfeng
    Ma, Jianhua
    Chen, Jiatong
    Wang, Yaming
    Fang, Xian
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 291
  • [28] Disentangled Cross-Modal Transformer for RGB-D Salient Object Detection and Beyond
    Chen, Hao
    Shen, Feihong
    Ding, Ding
    Deng, Yongjian
    Li, Chao
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1699 - 1709
  • [29] TANet: Transformer-based asymmetric network for RGB-D salient object detection
    Liu, Chang
    Yang, Gang
    Wang, Shuo
    Wang, Hangxu
    Zhang, Yunhua
    Wang, Yutao
    [J]. IET COMPUTER VISION, 2023, 17 (04) : 415 - 430
  • [30] TSVT: Token Sparsification Vision Transformer for robust RGB-D salient object detection
    Gao, Lina
    Liu, Bing
    Fu, Ping
    Xu, Mingzhu
    [J]. PATTERN RECOGNITION, 2024, 148