Attention-based fusion network for RGB-D semantic segmentation

Cited by: 0
Authors
Zhong, Li [1 ]
Guo, Chi [2 ,3 ]
Zhan, Jiao [2 ]
Deng, JingYi [2 ]
Affiliations
[1] Wuhan Univ, Sch Geodesy & Geomat, Wuhan, Hubei, Peoples R China
[2] Wuhan Univ, Res Ctr GNSS, Wuhan 430072, Peoples R China
[3] Hubei Luojia Lab, Wuhan, Peoples R China
Keywords
RGB-D semantic segmentation; Cross-modal fusion; Attention mechanism;
DOI
10.1016/j.neucom.2024.128371
CLC classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
RGB-D semantic segmentation enables a deeper understanding of scenes, which is crucial for many computer vision tasks. However, due to inherent differences between the two modalities and image noise, achieving high-quality segmentation with existing methods remains challenging. In this paper, we propose an attention-based fusion network for RGB-D semantic segmentation. Specifically, our network employs a forward multi-step propagation strategy and a backward progressive bootstrap fusion strategy built on an encoder-decoder architecture. By aggregating feature maps at different scales, we effectively reduce the uncertainty in the final prediction. Meanwhile, we introduce a Channel and Spatial Rectification Module (CSRM) to enable multi-dimensional interaction and noise removal. To integrate RGB and depth images comprehensively, we feed the rectified features into a Cross-Attention Fusion Module (CAFM). Extensive experiments show that our network adeptly handles a diverse array of complex scenarios, achieving superior performance and robust effectiveness on the indoor NYU Depth V2 and SUN RGB-D datasets and extending to the outdoor Cityscapes dataset.
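The abstract gives no implementation details of the CAFM, but the cross-modal attention fusion it describes can be illustrated with a toy bidirectional cross-attention between flattened RGB and depth feature tokens. This is a minimal sketch under assumed shapes and a symmetric-sum fusion rule; the function names and design choices here are illustrative assumptions, not the authors' actual module:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(rgb, depth):
    """Toy cross-modal attention fusion (illustrative, not the paper's CAFM).

    rgb, depth: (n_tokens, dim) matrices of flattened feature tokens.
    RGB tokens attend over depth tokens and vice versa; the two
    attended feature maps are summed into one fused representation.
    """
    d = rgb.shape[-1]
    # RGB queries attend to depth keys/values.
    attn_rgb_to_depth = softmax(rgb @ depth.T / np.sqrt(d))
    rgb_enhanced = attn_rgb_to_depth @ depth
    # Depth queries attend to RGB keys/values.
    attn_depth_to_rgb = softmax(depth @ rgb.T / np.sqrt(d))
    depth_enhanced = attn_depth_to_rgb @ rgb
    return rgb_enhanced + depth_enhanced

rng = np.random.default_rng(0)
rgb_feat = rng.standard_normal((16, 8))    # 16 tokens, 8-dim each
depth_feat = rng.standard_normal((16, 8))
fused = cross_attention_fuse(rgb_feat, depth_feat)
print(fused.shape)  # (16, 8): same token grid, cross-modally enriched
```

In a real network the queries, keys, and values would be learned linear projections of multi-scale encoder features rather than the raw tokens used here.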
Pages: 12
Related Papers (50 total)
  • [1] Attention-based three-branch network for RGB-D indoor semantic segmentation
    Lei, Bo
    Guo, Peiyan
    Jia, Shaoyun
    DIGITAL SIGNAL PROCESSING, 2025, 162
  • [2] Cross-modal attention fusion network for RGB-D semantic segmentation
    Zhao, Qiankun
    Wan, Yingcai
    Xu, Jiqian
    Fang, Lijin
    NEUROCOMPUTING, 2023, 548
  • [3] RAFNet: RGB-D attention feature fusion network for indoor semantic segmentation
    Yan, Xingchao
    Hou, Sujuan
    Karim, Awudu
    Jia, Weikuan
    DISPLAYS, 2021, 70
  • [4] A Fusion Network for Semantic Segmentation Using RGB-D Data
    Yuan, Jiahui
    Zhang, Kun
    Xia, Yifan
    Qi, Lin
    Dong, Junyu
    NINTH INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2017), 2018, 10615
  • [5] The Network of Attention-Aware Multimodal fusion for RGB-D Indoor Semantic Segmentation Method
    Zhao, Qiankun
    Wan, Yingcai
    Fang, Lijin
    Wang, Huaizhen
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 5093 - 5098
  • [6] DCANet: Differential convolution attention network for RGB-D semantic segmentation
    Bai, Lizhi
    Yang, Jun
    Tian, Chunqi
    Sun, Yaoru
    Mao, Maoyu
    Xu, Yanjun
    Xu, Weirong
    PATTERN RECOGNITION, 2025, 162
  • [7] CANet: Co-attention network for RGB-D semantic segmentation
    Zhou, Hao
    Qi, Lu
    Huang, Hai
    Yang, Xu
    Wan, Zhaoliang
    Wen, Xianglong
    PATTERN RECOGNITION, 2022, 124
  • [8] CDMANet: central difference mutual attention network for RGB-D semantic segmentation
    Ge, Mengjiao
    Su, Wen
    Gao, Jinfeng
    Jia, Guoqiang
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01):
  • [9] Transformer fusion for indoor RGB-D semantic segmentation
    Wu, Zongwei
    Zhou, Zhuyun
    Allibert, Guillaume
    Stolz, Christophe
    Demonceaux, Cedric
    Ma, Chao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
  • [10] PSCNet: Efficient RGB-D Semantic Segmentation Parallel Network Based on Spatial and Channel Attention
    Du, S. Q.
    Tang, S. J.
    Wang, W. X.
    Li, X. M.
    Lu, Y. H.
    Guo, R. Z.
    XXIV ISPRS CONGRESS: IMAGING TODAY, FORESEEING TOMORROW, COMMISSION I, 2022, 5-1 : 129 - 136