Attention-based fusion network for RGB-D semantic segmentation

Cited by: 0
Authors
Zhong, Li [1 ]
Guo, Chi [2 ,3 ]
Zhan, Jiao [2 ]
Deng, JingYi [2 ]
Affiliations
[1] Wuhan Univ, Sch Geodesy & Geomat, Wuhan, Hubei, Peoples R China
[2] Wuhan Univ, Res Ctr GNSS, Wuhan 430072, Peoples R China
[3] Hubei Luojia Lab, Wuhan, Peoples R China
Keywords
RGB-D semantic segmentation; Cross-modal fusion; Attention mechanism;
DOI: 10.1016/j.neucom.2024.128371
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
RGB-D semantic segmentation enables a profound comprehension of scenes, which is crucial in various computer vision tasks. However, due to inherent modal differences and image noise, achieving superior segmentation with existing methods remains challenging. In this paper, we propose an attention-based fusion network for RGB-D semantic segmentation. Specifically, our network employs a forward multi-step propagation strategy and a backward progressive bootstrap fusion strategy built on the encoder-decoder architecture. By aggregating feature maps at different scales, we effectively diminish the uncertainty in the final prediction. Meanwhile, we introduce a Channel and Spatial Rectification Module (CSRM) to enable multi-dimensional interactions and noise removal. To achieve comprehensive integration of RGB and depth images, we feed the rectified features into the Cross-Attention Fusion Module (CAFM). Extensive experiments show that our network can adeptly manage a diverse array of complex scenarios, demonstrating superior performance and robust effectiveness on the indoor NYU Depth V2 and SUN-RGBD datasets, and extending its capabilities to the outdoor Cityscapes dataset.
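The abstract does not specify the internal design of the CAFM. As a rough, hedged illustration of the general cross-modal attention idea it names (each modality attends to the other before the attended features are merged), the following NumPy sketch fuses toy RGB and depth feature maps; all shapes, names, and the symmetric sum are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(rgb, depth):
    """Toy cross-attention fusion (NOT the paper's CAFM):
    RGB positions query the depth features and vice versa,
    and the two attended maps are summed symmetrically."""
    n, c = rgb.shape                                 # n spatial positions, c channels
    attn_rd = softmax(rgb @ depth.T / np.sqrt(c))    # RGB attends to depth
    attn_dr = softmax(depth @ rgb.T / np.sqrt(c))    # depth attends to RGB
    return attn_rd @ depth + attn_dr @ rgb           # fused (n, c) feature map

rgb = np.random.randn(16, 8)    # 16 positions, 8 channels (toy sizes)
depth = np.random.randn(16, 8)
fused = cross_attention_fuse(rgb, depth)
print(fused.shape)  # (16, 8)
```

In a real network the queries, keys, and values would be learned linear projections of multi-scale encoder features rather than the raw maps used here.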
Pages: 12
Related Papers
50 records in total
  • [31] Feature fusion and context interaction for RGB-D indoor semantic segmentation
    Liu, Heng
    Xie, Wen
    Wang, Shaoxun
    APPLIED SOFT COMPUTING, 2024, 167
  • [32] Self-Enhanced Feature Fusion for RGB-D Semantic Segmentation
    Xiang, Pengcheng
    Yao, Baochen
    Jiang, Zefeng
    Peng, Chengbin
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 3015 - 3019
  • [33] TSNet: Three-Stream Self-Attention Network for RGB-D Indoor Semantic Segmentation
    Zhou, Wujie
    Yuan, Jianzhong
    Lei, Jingsheng
    Luo, Ting
    IEEE INTELLIGENT SYSTEMS, 2021, 36 (04) : 73 - 78
  • [34] Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation
    Zhu, Xingyu
    Wang, Xin
    Freer, Jonathan
    Chang, Hyung Jin
    Gao, Yixing
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2023), 2023, : 9471 - 9477
  • [35] Small Obstacle Avoidance Based on RGB-D Semantic Segmentation
    Hua, Minjie
    Nan, Yibing
    Lian, Shiguo
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 886 - 894
  • [36] GANet: geometry-aware network for RGB-D semantic segmentation
    Tian, Chunqi
    Xu, Weirong
    Bai, Lizhi
    Yang, Jun
    Xu, Yanjun
    APPLIED INTELLIGENCE, 2025, 55 (06)
  • [37] DDNet: Depth Dominant Network for Semantic Segmentation of RGB-D Images
    Rong, Peizhi
    SENSORS, 2024, 24 (21)
  • [38] Intra-inter Modal Attention Blocks for RGB-D Semantic Segmentation
    Choi, Soyun
    Zhang, Youjia
    Hong, Sungeun
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 217 - 225
  • [39] SPNet: An RGB-D Sequence Progressive Network for Road Semantic Segmentation
    Zhou, Zhi
    Zhang, Yuhang
    Hua, Guoguang
    Long, Ruijing
    Tian, Shishun
    Zou, Wenbin
    2023 IEEE 25TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, MMSP, 2023
  • [40] RGB-D Dual Modal Information Complementary Semantic Segmentation Network
    Wang L.
    Gu N.
    Xin J.
    Wang S.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (10): : 1489 - 1499