Attention-aware Cross-modal Cross-level Fusion Network for RGB-D Salient Object Detection

Citations: 0
Authors
Chen, Hao [1]
Li, You-Fu [1, 2]
Su, Dan [1]
Affiliations
[1] City Univ Hong Kong, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Shenzhen Res Inst, Hong Kong, Peoples R China
Keywords
MODEL;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional neural networks have achieved wide success in RGB saliency detection. Recently, the advent of RGB-D sensors such as Kinect has provided additional geometric saliency cues. However, the key challenge for RGB-D salient object detection, namely how to fuse RGB and depth information sufficiently, is still under-studied. Previous works mainly follow a two-stream architecture and combine RGB and depth features or decisions at an early or late stage. The multi-modal fusion stage is performed by directly concatenating the features from the two modalities without selection. In this work, we address this question by proposing a novel network built on a distinct insight: a selection module is significantly helpful for more informative and sufficient cross-modal cross-level combination. To this end, we introduce a top-down RGB-D fusion network that integrates an attention-aware cross-modal cross-level fusion block at each level to select discriminative features from each level and each modality. Extensive experiments on public datasets show that the proposed network addresses the key problems in RGB-D fusion and achieves state-of-the-art performance on RGB-D salient object detection.
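The contrast the abstract draws between plain concatenation and attention-based selection can be illustrated with a minimal sketch. The abstract does not specify the form of the attention module, so the code below assumes a simple channel-gating scheme (global average pooling followed by a linear layer and a sigmoid) as a stand-in. The function names `concat_fusion` and `attention_fusion`, the weights `w` and `b`, and the feature shapes are all hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def concat_fusion(rgb, depth):
    """Baseline: stack RGB and depth features along the channel axis, no selection."""
    return np.concatenate([rgb, depth], axis=0)  # shape (2C, H, W)


def attention_fusion(rgb, depth, w, b):
    """Assumed channel-gating fusion (not the paper's exact block).

    rgb, depth: feature maps of shape (C, H, W)
    w: weight matrix of shape (2C, 2C); b: bias of shape (2C,)
    Returns the gated features (2C, H, W) and the per-channel gates (2C,).
    """
    feats = np.concatenate([rgb, depth], axis=0)   # (2C, H, W)
    desc = feats.mean(axis=(1, 2))                 # (2C,) global average pooling
    gates = sigmoid(w @ desc + b)                  # (2C,) each gate lies in (0, 1)
    return feats * gates[:, None, None], gates     # broadcast gates over H and W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.standard_normal((4, 8, 8))
    depth = rng.standard_normal((4, 8, 8))
    w = rng.standard_normal((8, 8)) * 0.1          # random, untrained weights
    b = np.zeros(8)
    fused, gates = attention_fusion(rgb, depth, w, b)
    print(fused.shape, gates.round(3))
```

In a trained network the gate weights would be learned, so that channels from a less reliable modality (for example, a noisy depth map) receive gates closer to zero. Here the weights are random and serve only to show the data flow.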
Pages: 6821-6826
Page count: 6
Related papers
50 records in total
  • [21] Joint Cross-Modal and Unimodal Features for RGB-D Salient Object Detection
    Huang, Nianchang
    Liu, Yi
    Zhang, Qiang
    Han, Jungong
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 2428 - 2441
  • [22] Cross-modal attention fusion network for RGB-D semantic segmentation
    Zhao, Qiankun
    Wan, Yingcai
    Xu, Jiqian
    Fang, Lijin
    [J]. NEUROCOMPUTING, 2023, 548
  • [23] Feature Enhancement and Multi-scale Cross-Modal Attention for RGB-D Salient Object Detection
    Wan, Xin
    Yang, Gang
    Zhou, Boyi
    Liu, Chang
    Wang, Hangxu
    Wang, Yutao
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2021, PT II, 2021, 13020 : 409 - 420
  • [24] RGB-D Salient Object Detection Based on Cross-Modal Fusion and Boundary Deformable Convolution Guidance
    Meng, Ling-Bing
    Yuan, Meng-Ya
    Shi, Xue-Han
    Zhang, Le
    Wu, Jin-Hua
    Cheng, Fei
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (11) : 3155 - 3166
  • [25] Feature interaction and two-stage cross-modal fusion for RGB-D salient object detection
    Yu, Ming
    Liu, Jiali
    Liu, Yi
    Yan, Gang
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (02) : 4543 - 4556
  • [26] Attentive Cross-Modal Fusion Network for RGB-D Saliency Detection
    Liu, Di
    Zhang, Kao
    Chen, Zhenzhong
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 967 - 981
  • [27] A cross-modal edge-guided salient object detection for RGB-D image
    Liu, Zhengyi
    Wang, Kaixun
    Dong, Hao
    Wang, Yuan
    [J]. NEUROCOMPUTING, 2021, 454 : 168 - 177
  • [28] Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection
    Chen, Hao
    Li, Youfu
    Su, Dan
    [J]. PATTERN RECOGNITION, 2019, 86 : 376 - 385
  • [29] PATNet: Patch-to-pixel attention-aware transformer network for RGB-D and RGB-T salient object detection
    Jiang, Mingfeng
    Ma, Jianhua
    Chen, Jiatong
    Wang, Yaming
    Fang, Xian
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 291
  • [30] Cross-Modal Adaptation for RGB-D Detection
    Hoffman, Judy
    Gupta, Saurabh
    Leong, Jian
    Guadarrama, Sergio
    Darrell, Trevor
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2016 : 5032 - 5039