Three-Stream Attention-Aware Network for RGB-D Salient Object Detection

Cited by: 228
Authors
Chen, Hao [1 ]
Li, Youfu [1 ]
Affiliations
[1] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
Keywords
Three-stream; RGB-D; saliency detection; cross-modal cross-level attention; FUSION;
DOI
10.1109/TIP.2019.2891104
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous RGB-D fusion systems based on convolutional neural networks typically employ a two-stream architecture, in which RGB and depth inputs are learned independently. The multi-modal fusion stage is typically performed by concatenating the deep features from each stream in the inference process. The traditional two-stream architecture may suffer from insufficient multi-modal fusion due to the following two limitations: 1) the cross-modal complementarity is rarely studied in the bottom-up path, wherein we believe the cross-modal complements can be combined to learn new discriminative features to enlarge the RGB-D representation community, and 2) the cross-modal channels are typically combined by undifferentiated concatenation, which is ambiguous for selecting cross-modal complementary features. In this paper, we address these two limitations by proposing a novel three-stream attention-aware multi-modal fusion network. In the proposed architecture, a cross-modal distillation stream, accompanying the RGB-specific and depth-specific streams, is introduced to extract new RGB-D features at each level in the bottom-up path. Furthermore, the channel-wise attention mechanism is innovatively introduced to the cross-modal cross-level fusion problem to adaptively select complementary feature maps from each modality at each level. Extensive experiments demonstrate the effectiveness of the proposed architecture and its significant improvement over the state-of-the-art RGB-D salient object detection methods.
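The channel-wise attention idea described in the abstract can be illustrated with a minimal NumPy sketch: concatenated RGB and depth channels are re-weighted by a gate computed from their global statistics, so complementary channels can be emphasized rather than merged by undifferentiated concatenation. The bottleneck sizes and random weights below are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def channel_attention_fuse(rgb_feat, depth_feat, w1, w2):
    """SE-style channel-wise attention over concatenated cross-modal
    features: each channel is re-weighted by a learned gate in (0, 1)."""
    # Concatenate cross-modal feature maps along the channel axis.
    x = np.concatenate([rgb_feat, depth_feat], axis=0)   # (2C, H, W)
    # Global average pooling -> one descriptor per channel.
    desc = x.mean(axis=(1, 2))                           # (2C,)
    # Two-layer bottleneck with a sigmoid gate (illustrative weights).
    hidden = np.maximum(0.0, w1 @ desc)                  # ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))          # (2C,)
    # Adaptively re-weight each cross-modal channel.
    return x * gate[:, None, None]

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
rgb = rng.standard_normal((C, H, W))
depth = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((2, 2 * C)) * 0.1   # squeeze: 2C -> 2
w2 = rng.standard_normal((2 * C, 2)) * 0.1   # expand: 2 -> 2C
fused = channel_attention_fuse(rgb, depth, w1, w2)
print(fused.shape)  # (8, 8, 8)
```

In a trained network the gate weights would be learned per fusion level, letting the model suppress noisy depth channels and amplify informative ones before the cross-level fusion step.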
Pages: 2825-2835
Page count: 11
Related Papers
50 records
  • [1] PATNet: Patch-to-pixel attention-aware transformer network for RGB-D and RGB-T salient object detection
    Jiang, Mingfeng
    Ma, Jianhua
    Chen, Jiatong
    Wang, Yaming
    Fang, Xian
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 291
  • [2] Attention-aware Cross-modal Cross-level Fusion Network for RGB-D Salient Object Detection
    Chen, Hao
    Li, You-Fu
    Su, Dan
    [J]. 2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 6821 - 6826
  • [3] Bilateral Attention Network for RGB-D Salient Object Detection
    Zhang, Zhao
    Lin, Zheng
    Xu, Jun
    Jin, Wen-Da
    Lu, Shao-Ping
    Fan, Deng-Ping
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 1949 - 1961
  • [4] Three-stream RGB-D salient object detection network based on cross-level and cross-modal dual-attention fusion
    Meng, Lingbing
    Yuan, Mengya
    Shi, Xuehan
    Liu, Qingqing
    Cheng, Fei
    Li, Lingli
    [J]. IET IMAGE PROCESSING, 2023, 17 (11) : 3292 - 3308
  • [5] Three-stream interaction decoder network for RGB-thermal salient object detection
    Huo, Fushuo
    Zhu, Xuegui
    Li, Bingheng
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 258
  • [6] Context-aware network for RGB-D salient object detection
    Liang, Fangfang
    Duan, Lijuan
    Ma, Wei
    Qiao, Yuanhua
    Miao, Jun
    Ye, Qixiang
    [J]. PATTERN RECOGNITION, 2021, 111
  • [7] Hybrid-Attention Network for RGB-D Salient Object Detection
    Chen, Yuzhen
    Zhou, Wujie
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (17):
  • [8] TCANet: three-stream coordinate attention network for RGB-D indoor semantic segmentation
    Jia, Weikuan
    Yan, Xingchao
    Liu, Qiaolian
    Zhang, Ting
    Dong, Xishang
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (01) : 1219 - 1230
  • [9] DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection
    Chen, Zuyao
    Cong, Runmin
    Xu, Qianqian
    Huang, Qingming
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 7012 - 7024
  • [10] Depth-aware lightweight network for RGB-D salient object detection
    Ling, Liuyi
    Wang, Yiwen
    Wang, Chengjun
    Xu, Shanyong
    Huang, Yourui
    [J]. IET IMAGE PROCESSING, 2023, 17 (08) : 2350 - 2361