Attentive Cross-Modal Fusion Network for RGB-D Saliency Detection

Cited by: 21
Authors
Liu, Di [1 ]
Zhang, Kao [1 ]
Chen, Zhenzhong [1 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
Object detection; Saliency detection; Feature extraction; Fuses; Visualization; Computational modeling; Semantics; Cross-modal attention; residual attention; fusion refinement network; RGB-D salient object detection; OBJECT DETECTION; MODEL; DISPARITY; FIXATION;
DOI
10.1109/TMM.2020.2991523
CLC number
TP [Automation technology, computer technology];
Discipline code
0812;
Abstract
In this paper, an attentive cross-modal fusion (ACMF) network is proposed for RGB-D salient object detection. The proposed method selectively fuses features in a cross-modal manner and uses a fusion refinement module to merge output features from different resolutions. Our attentive cross-modal fusion network is built on residual attention: at each level of the ResNet outputs, both the RGB and depth features are transformed into an identity map and a weighted attention map, and each identity map is reweighted by the attention map of the paired modality. Moreover, lower-level features with higher resolution are adopted to refine the boundaries of detected targets. The entire architecture can be trained end-to-end. The proposed ACMF is compared with state-of-the-art methods on eight recent datasets, and the results demonstrate that our model achieves strong performance on RGB-D salient object detection.
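A minimal PyTorch sketch of the cross-modal residual-attention fusion described in the abstract is given below: each modality's identity feature is reweighted by the attention map of the paired modality and added back in residual form. Module and variable names (CrossModalResidualFusion, rgb_attn, depth_attn) are illustrative assumptions, not the authors' released implementation.

```python
# Sketch of cross-modal residual-attention fusion (assumed, not the authors' code).
import torch
import torch.nn as nn


class CrossModalResidualFusion(nn.Module):
    """Fuse same-level RGB and depth features: each modality's identity map
    is reweighted by the attention map of the paired modality (residual form)."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions producing per-pixel attention weights for each modality.
        self.rgb_attn = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.depth_attn = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, f_rgb: torch.Tensor, f_depth: torch.Tensor) -> torch.Tensor:
        attn_rgb = self.rgb_attn(f_rgb)        # attention derived from the RGB branch
        attn_depth = self.depth_attn(f_depth)  # attention derived from the depth branch
        # Residual attention: identity map plus identity reweighted by the
        # attention map of the *other* modality.
        rgb_fused = f_rgb + f_rgb * attn_depth
        depth_fused = f_depth + f_depth * attn_rgb
        return rgb_fused + depth_fused          # combined cross-modal feature


if __name__ == "__main__":
    # Example with an assumed ResNet mid-level feature size.
    rgb_feat = torch.randn(2, 256, 28, 28)
    depth_feat = torch.randn(2, 256, 28, 28)
    fused = CrossModalResidualFusion(256)(rgb_feat, depth_feat)
    print(fused.shape)  # torch.Size([2, 256, 28, 28])
```

In the paper's pipeline, fused features of this kind from several ResNet levels would then be passed to the fusion refinement module, with higher-resolution low-level features used to sharpen object boundaries.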
Pages: 967 - 981
Page count: 15
Related papers
50 records in total
  • [31] RGB-D Salient Object Detection Based on Cross-Modal Fusion and Boundary Deformable Convolution Guidance
    Meng L.-B.
    Yuan M.-Y.
    Shi X.-H.
    Zhang L.
    Wu J.-H.
    Cheng F.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (11): 3155 - 3166
  • [32] Feature interaction and two-stage cross-modal fusion for RGB-D salient object detection
    Yu, Ming
    Liu, Jiali
    Liu, Yi
    Yan, Gang
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (02) : 4543 - 4556
  • [33] CCANet: A Collaborative Cross-Modal Attention Network for RGB-D Crowd Counting
    Liu, Yanbo
    Cao, Guo
    Shi, Boshan
    Hu, Yingxiang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 154 - 165
  • [34] Disentangled Cross-Modal Transformer for RGB-D Salient Object Detection and Beyond
    Chen, Hao
    Shen, Feihong
    Ding, Ding
    Deng, Yongjian
    Li, Chao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1699 - 1709
  • [35] Joint Cross-Modal and Unimodal Features for RGB-D Salient Object Detection
    Huang, Nianchang
    Liu, Yi
    Zhang, Qiang
    Han, Jungong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 2428 - 2441
  • [36] A Cross-Modal Feature Fusion Model Based on ConvNeXt for RGB-D Semantic Segmentation
    Tang, Xiaojiang
    Li, Baoxia
    Guo, Junwei
    Chen, Wenzhuo
    Zhang, Dan
    Huang, Feng
    MATHEMATICS, 2023, 11 (08)
  • [37] RGB-D Saliency Detection by Multi-stream Late Fusion Network
    Chen, Hao
    Li, Youfu
    Su, Dan
    COMPUTER VISION SYSTEMS, ICVS 2017, 2017, 10528 : 459 - 468
  • [38] Cross-modal and multi-level feature refinement network for RGB-D salient object detection
    Gao, Yue
    Dai, Meng
    Zhang, Qing
    VISUAL COMPUTER, 2023, 39 (09): 3979 - 3994
  • [40] Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection
    Chen, Hao
    Li, Youfu
    Su, Dan
    PATTERN RECOGNITION, 2019, 86 : 376 - 385