RGB-D Dual Modal Information Complementary Semantic Segmentation Network

Cited by: 0

Authors:

Wang L. [1]

Gu N. [1]

Xin J. [1]

Wang S. [1]

Affiliation:

[1] Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing

Source:

Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2023, Vol. 35, No. 10

Keywords:

attention mechanism; deep learning; encoder-decoder; RGB-D information complementary; RGB-D semantic segmentation

DOI:

10.3724/SP.J.1089.2023.19592

Abstract:

To fuse RGB and depth information more fully and thereby improve the accuracy of semantic segmentation, an attention mechanism is introduced to achieve complementary fusion of RGB and depth modal features. The proposed RGB-D dual-modal information complementary semantic segmentation network is built on an encoder-decoder framework: the encoder uses a two-branch structure to extract feature maps from the RGB image and the depth image separately, and the decoder uses layer-by-layer skip connections to gradually integrate semantic information of different granularity and perform pixel-level semantic classification. For the features learned in the lower layers, the encoder uses an RGB-D information complementary module to fuse the features of each modality into the other. This module contains two kinds of attention, the Depth-guided Attention Module (Depth-AM) and the RGB-guided Attention Module (RGB-AM). Depth-AM uses the original depth information to supplement the RGB features, addressing inaccurate RGB features caused by illumination changes; RGB-AM uses the RGB features to supplement the depth features, addressing inaccurate depth features caused by the lack of object texture information. With a backbone of the same structure, the proposed network clearly improves on RDF-Net: mIoU, pixel accuracy and mean pixel accuracy improve by 1.8%, 0.5% and 0.7% on the SUNRGB-D dataset, and by 1.8%, 1.3% and 1.9% on the NYUv2 dataset. © 2023 Institute of Computing Technology. All rights reserved.
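
The three evaluation metrics named in the abstract can be computed from a per-class confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `segmentation_metrics` and the matrix layout are hypothetical, and the third metric is read here as mean per-class pixel accuracy, which is an assumption about what the abstract intends.

```python
def segmentation_metrics(conf):
    """Return (pixel accuracy, mean per-class accuracy, mIoU).

    Assumed layout (hypothetical, not taken from the paper):
    conf[i][j] = number of pixels whose true class is i and predicted class is j.
    The matrix is assumed to be square and to contain at least one pixel.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(n))
    pixel_acc = correct / total

    class_accs, ious = [], []
    for i in range(n):
        tp = conf[i][i]
        gt = sum(conf[i])                         # pixels whose true class is i
        pred = sum(conf[r][i] for r in range(n))  # pixels predicted as class i
        union = gt + pred - tp
        if gt > 0:                                # skip classes absent from the ground truth
            class_accs.append(tp / gt)
        if union > 0:                             # skip classes absent from both maps
            ious.append(tp / union)

    mean_acc = sum(class_accs) / len(class_accs)
    miou = sum(ious) / len(ious)
    return pixel_acc, mean_acc, miou


# Toy two-class example with 10 pixels in total.
pa, mpa, miou = segmentation_metrics([[3, 1], [1, 5]])
```

Classes that never occur are skipped rather than counted as zero; published benchmarks differ on this convention, so scores computed this way may not match reported numbers exactly.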

Pages: 1489-1499

Page count: 10

