MambaSOD: Dual Mamba-driven cross-modal fusion network for RGB-D Salient Object Detection

Cited by: 0

Authors:

Zhan, Yue [2]

Zeng, Zhihong [1,3]

Liu, Haijun [3]

Tan, Xiaoheng [3]

Tian, Yinli [4]

Affiliations:

[1] Institute of Interdisciplinary Studies, Guangdong Polytechnic Normal University, Guangzhou, China

[2] Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong

[3] School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China

[4] School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China

Source:

Neurocomputing | 2025 / Vol. 631

Funding:

National Natural Science Foundation of China

Keywords:

Modal analysis - Object detection - Object recognition;

DOI:

10.1016/j.neucom.2025.129718

Abstract:

RGB-D Salient Object Detection (SOD) aims to accurately pinpoint the most visually conspicuous regions in an image. Many conventional models rely heavily on CNNs and overlook long-range contextual dependencies; subsequent transformer-based models address this issue to some extent but introduce quadratic computational complexity. Moreover, incorporating spatial information from depth maps has proven effective for this task, and the primary challenge is how to effectively fuse the complementary information from the RGB and depth modalities. Recent advances in Mamba, particularly its ability to model long-range dependencies with linear complexity, have motivated our exploration of its potential in the RGB-D SOD task. In this paper, we propose a dual Mamba-driven cross-modal fusion network for RGB-D SOD, named MambaSOD, which effectively leverages Mamba's long-range dependency modeling capability. Specifically, we employ a dual Mamba-driven feature extractor to process the RGB and depth inputs and obtain features with global contextual information. Then, we design a cross-modal fusion Mamba to perform modality-specific feature enhancement and model the inter-modal correlation between the RGB and depth features. To the best of our knowledge, this work is an innovative attempt to explore the potential of pure Mamba in the RGB-D SOD task, offering a novel perspective. Extensive experiments on seven prevailing datasets demonstrate our method's superiority over eighteen state-of-the-art RGB-D SOD models. The source code will be released at https://github.com/YueZhan721/MambaSOD. © 2025 Elsevier B.V.
