RGB-D Dual Modal Information Complementary Semantic Segmentation Network

Cited by: 0
Authors
Wang L. [1]
Gu N. [1]
Xin J. [1]
Wang S. [1]
Affiliations
[1] Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing
Keywords
attention mechanism; deep learning; encoder-decoder; RGB-D information complementary; RGB-D semantic segmentation
DOI
10.3724/SP.J.1089.2023.19592
Abstract
To fuse RGB and depth information more fully and thereby improve the accuracy of semantic segmentation, an attention mechanism is introduced to achieve complementary fusion of RGB and depth modal features. The proposed RGB-D dual-modal information complementary semantic segmentation network is built on an encoder-decoder framework. The encoder uses a two-branch structure to extract feature maps from the RGB image and the depth image separately, and the decoder uses layer-by-layer skip connections to progressively integrate semantic information at different granularities and perform pixel-level semantic classification. For the features learned in the lower layers, the encoder uses an RGB-D information complementary module to fuse the features of each modality into the other. The module contains two kinds of attention: the Depth-guided Attention Module (Depth-AM) and the RGB-guided Attention Module (RGB-AM). Depth-AM uses the original depth information to supplement the RGB features, addressing RGB features made inaccurate by illumination changes. RGB-AM uses the RGB features to supplement the depth features, addressing depth features made inaccurate by the lack of object texture information. With a backbone of the same structure, the proposed network improves on RDF-Net: mIoU, pixel accuracy and mean pixel accuracy improve by 1.8%, 0.5% and 0.7% on the SUNRGB-D dataset, and by 1.8%, 1.3% and 1.9% on the NYUv2 dataset. © 2023 Institute of Computing Technology. All rights reserved.
Pages: 1489-1499
Page count: 10
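
The abstract reports gains in mIoU, pixel accuracy and a third, mean-type metric without defining them. The sketch below shows how such metrics are conventionally computed from a confusion matrix, assuming the third metric is the mean per-class pixel accuracy. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical, the labels are toy values, and this is not the authors' evaluation code; in particular, it does not handle an "ignore" label, which real benchmarks often use.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions.

    y_true and y_pred are flat sequences of integer class ids.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (pixel accuracy, mean per-class pixel accuracy, mIoU).

    Classes with no ground-truth pixels are skipped in the two means
    (an assumption; conventions differ between benchmarks).
    """
    n = len(cm)
    tp = [cm[i][i] for i in range(n)]                      # correctly labelled pixels
    gt = [sum(cm[i]) for i in range(n)]                    # ground-truth pixels per class
    pred = [sum(cm[r][i] for r in range(n)) for i in range(n)]  # predicted pixels per class
    present = [i for i in range(n) if gt[i] > 0]

    pixel_acc = sum(tp) / sum(gt)
    mean_pixel_acc = sum(tp[i] / gt[i] for i in present) / len(present)
    # IoU per class = TP / (ground truth + predicted - TP)
    miou = sum(tp[i] / (gt[i] + pred[i] - tp[i]) for i in present) / len(present)
    return pixel_acc, mean_pixel_acc, miou


# Toy example: six pixels, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
pixel_acc, mean_pixel_acc, miou = segmentation_metrics(cm)
print(pixel_acc, mean_pixel_acc, miou)
```

On the toy labels, four of six pixels are correct, so pixel accuracy and mean per-class pixel accuracy are both 2/3. The per-class IoU values are 1/3, 2/3 and 1/2, giving an mIoU of 0.5. mIoU is the stricter measure because it also penalizes false positives.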