MambaSOD: Dual Mamba-driven cross-modal fusion network for RGB-D Salient Object Detection

Cited by: 0
Authors
Zhan, Yue [2]
Zeng, Zhihong [1, 3]
Liu, Haijun [3]
Tan, Xiaoheng [3]
Tian, Yinli [4]
Affiliations
[1] Institute of Interdisciplinary Studies, Guangdong Polytechnic Normal University, Guangzhou, China
[2] Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong
[3] School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
[4] School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
Funding
National Natural Science Foundation of China
Keywords
Modal analysis - Object detection - Object recognition
DOI
10.1016/j.neucom.2025.129718
Abstract
The purpose of RGB-D Salient Object Detection (SOD) is to accurately pinpoint the most visually conspicuous areas within images. Many conventional models rely heavily on CNNs and overlook long-range contextual dependencies; subsequent transformer-based models have addressed this issue to some extent but introduce quadratic computational complexity. Moreover, incorporating spatial information from depth maps has proven effective for this task, and the primary challenge is how to effectively fuse the complementary information from RGB and depth. Recent advancements in Mamba, particularly its superior ability to perform long-range modeling with linear complexity, have motivated our exploration of its potential in the RGB-D SOD task. In this paper, we propose a dual Mamba-driven cross-modal fusion network for RGB-D SOD, named MambaSOD, which effectively leverages Mamba's long-range dependency modeling capability. Specifically, we employ a dual Mamba-driven feature extractor to process RGB and depth inputs to obtain features with global contextual information. Then, we design a cross-modal fusion Mamba to perform modality-specific feature enhancement and model the inter-modal correlation between the RGB and depth features. To the best of our knowledge, this work is an innovative attempt to explore the potential of pure Mamba in the RGB-D SOD task, offering a novel perspective. Extensive experiments conducted on seven prevailing datasets demonstrate our method's superiority over eighteen state-of-the-art RGB-D SOD models. The source code will be released at https://github.com/YueZhan721/MambaSOD. © 2025 Elsevier B.V.
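As a rough illustration of the complexity contrast drawn in the abstract, the sketch below runs a scalar linear state-space recurrence in a single pass and compares that with the number of pairwise scores full self-attention computes. It is not taken from the MambaSOD code: the names `ssm_scan` and `pairwise_interactions` and the parameters `a`, `b`, `c` are hypothetical. Mamba itself uses input-dependent (selective) parameters and a multi-dimensional state, both of which are omitted here.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t
    The loop visits each element once, so cost grows linearly with length.
    (a, b, c are fixed illustrative constants, not learned parameters.)
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x  # the state summarizes all earlier inputs
        ys.append(c * h)
    return ys


def pairwise_interactions(length):
    """Count the query-key pairs full self-attention scores for one sequence."""
    return length * length


if __name__ == "__main__":
    # An impulse at step 0 decays geometrically through the state.
    print(ssm_scan([1.0, 0.0, 0.0, 0.0], a=0.5))  # [1.0, 0.5, 0.25, 0.125]
    # 1024 tokens: 1024 recurrence steps versus 1024 * 1024 attention scores.
    print(pairwise_interactions(1024))  # 1048576
```

The impulse response shows how one early input keeps influencing later outputs through the state, which is the sense in which such a recurrence carries long-range context without comparing every pair of positions.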