Attention-based fusion network for RGB-D semantic segmentation

Cited by: 0
Authors
Zhong, Li [1 ]
Guo, Chi [2 ,3 ]
Zhan, Jiao [2 ]
Deng, JingYi [2 ]
Affiliations
[1] Wuhan Univ, Sch Geodesy & Geomat, Wuhan, Hubei, Peoples R China
[2] Wuhan Univ, Res Ctr GNSS, Wuhan 430072, Peoples R China
[3] Hubei Luojia Lab, Wuhan, Peoples R China
Keywords
RGB-D semantic segmentation; Cross-modal fusion; Attention mechanism;
DOI: 10.1016/j.neucom.2024.128371
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
RGB-D semantic segmentation enables a profound comprehension of scenes, which is crucial in various computer vision tasks. However, due to inherent modal differences and image noise, achieving superior segmentation with existing methods remains challenging. In this paper, we propose an attention-based fusion network for RGB-D semantic segmentation. Specifically, our network employs a forward multi-step propagation strategy and a backward progressive bootstrap fusion strategy built on the encoder-decoder architecture. By aggregating feature maps at different scales, we effectively diminish the uncertainty in the final prediction. Meanwhile, we introduce a Channel and Spatial Rectification Module (CSRM) to enable multi-dimensional interactions and noise removal. To achieve comprehensive integration of RGB and depth images, we feed the rectified features into the Cross-Attention Fusion Module (CAFM). Extensive experiments show that our network can adeptly manage a diverse array of complex scenarios, demonstrating superior performance and robust effectiveness on the indoor NYU Depth V2 and SUN-RGBD datasets, and extending its capabilities to the outdoor Cityscapes dataset.
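The abstract does not specify the internal design of the CAFM. As a rough, hedged illustration of the general cross-modal attention idea it names (each modality attends to the other before the attended features are merged), the following NumPy sketch fuses toy RGB and depth feature maps; all shapes, names, and the symmetric sum are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(rgb, depth):
    """Toy cross-attention fusion (NOT the paper's CAFM):
    RGB positions query the depth features and vice versa,
    and the two attended maps are summed symmetrically."""
    n, c = rgb.shape                                 # n spatial positions, c channels
    attn_rd = softmax(rgb @ depth.T / np.sqrt(c))    # RGB attends to depth
    attn_dr = softmax(depth @ rgb.T / np.sqrt(c))    # depth attends to RGB
    return attn_rd @ depth + attn_dr @ rgb           # fused (n, c) feature map

rgb = np.random.randn(16, 8)    # 16 positions, 8 channels (toy sizes)
depth = np.random.randn(16, 8)
fused = cross_attention_fuse(rgb, depth)
print(fused.shape)  # (16, 8)
```

In a real network the queries, keys, and values would be learned linear projections of multi-scale encoder features rather than the raw maps used here.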
Pages: 12
Related Papers
50 records in total
  • [31] Feature fusion and context interaction for RGB-D indoor semantic segmentation
    Liu, Heng
    Xie, Wen
    Wang, Shaoxun
    APPLIED SOFT COMPUTING, 2024, 167
  • [32] Self-Enhanced Feature Fusion for RGB-D Semantic Segmentation
    Xiang, Pengcheng
    Yao, Baochen
    Jiang, Zefeng
    Peng, Chengbin
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 3015 - 3019
  • [33] TSNet: Three-Stream Self-Attention Network for RGB-D Indoor Semantic Segmentation
    Zhou, Wujie
    Yuan, Jianzhong
    Lei, Jingsheng
    Luo, Ting
    IEEE INTELLIGENT SYSTEMS, 2021, 36 (04) : 73 - 78
  • [34] Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation
    Zhu, Xingyu
    Wang, Xin
    Freer, Jonathan
    Chang, Hyung Jin
    Gao, Yixing
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2023), 2023, : 9471 - 9477
  • [35] Small Obstacle Avoidance Based on RGB-D Semantic Segmentation
    Hua, Minjie
    Nan, Yibing
    Lian, Shiguo
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 886 - 894
  • [36] GANet: geometry-aware network for RGB-D semantic segmentation
    Tian, Chunqi
    Xu, Weirong
    Bai, Lizhi
    Yang, Jun
    Xu, Yanjun
    APPLIED INTELLIGENCE, 2025, 55 (06)
  • [37] DDNet: Depth Dominant Network for Semantic Segmentation of RGB-D Images
    Rong, Peizhi
    SENSORS, 2024, 24 (21)
  • [38] Intra-inter Modal Attention Blocks for RGB-D Semantic Segmentation
    Choi, Soyun
    Zhang, Youjia
    Hong, Sungeun
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 217 - 225
  • [39] SPNet: An RGB-D Sequence Progressive Network for Road Semantic Segmentation
    Zhou, Zhi
    Zhang, Yuhang
    Hua, Guoguang
    Long, Ruijing
    Tian, Shishun
    Zou, Wenbin
    2023 IEEE 25TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, MMSP, 2023
  • [40] RGB-D Dual Modal Information Complementary Semantic Segmentation Network
    Wang L.
    Gu N.
    Xin J.
    Wang S.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (10): : 1489 - 1499