Deep Multi-Branch Aggregation Network for Real-Time Semantic Segmentation in Street Scenes

Cited by: 24
Authors
Weng, Xi [1]
Yan, Yan [1]
Dong, Genshun [1]
Shu, Chang [2]
Wang, Biao [3]
Wang, Hanzi [1]
Zhang, Ji [3, 4]
Affiliations
[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen 361005, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
[3] Zhejiang Lab, Hangzhou 311101, Peoples R China
[4] Univ Southern Queensland, Sch Math Phys & Comp, Toowoomba, Qld 4350, Australia
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Semantics; Real-time systems; Image segmentation; Lattices; Decoding; Task analysis; Feature extraction; Deep learning; real-time semantic segmentation; lightweight convolutional neural networks; multi-branch aggregation;
DOI
10.1109/TITS.2022.3150350
Chinese Library Classification number
TU [Building Science]
Discipline classification code
0813
Abstract
Real-time semantic segmentation, which aims to achieve high segmentation accuracy at real-time inference speed, has received substantial attention over the past few years. However, many state-of-the-art real-time semantic segmentation methods tend to sacrifice some spatial details or contextual information for fast inference, thus leading to degradation in segmentation quality. In this paper, we propose a novel Deep Multi-branch Aggregation Network (called DMA-Net) based on the encoder-decoder structure to perform real-time semantic segmentation in street scenes. Specifically, we first adopt ResNet-18 as the encoder to efficiently generate various levels of feature maps from different stages of convolutions. Then, we develop a Multi-branch Aggregation Network (MAN) as the decoder to effectively aggregate different levels of feature maps and capture the multi-scale information. In MAN, a lattice enhanced residual block is designed to enhance feature representations of the network by taking advantage of the lattice structure. Meanwhile, a feature transformation block is introduced to explicitly transform the feature map from the neighboring branch before feature aggregation. Moreover, a global context block is used to exploit the global contextual information. These key components are tightly combined and jointly optimized in a unified network. Extensive experimental results on the challenging Cityscapes and CamVid datasets demonstrate that our proposed DMA-Net respectively obtains 77.0% and 73.6% mean Intersection over Union (mIoU) at the inference speed of 46.7 FPS and 119.8 FPS by only using a single NVIDIA GTX 1080Ti GPU. This shows that DMA-Net provides a good tradeoff between segmentation quality and speed for semantic segmentation in street scenes.
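The accuracy figures in the abstract are reported as mean Intersection over Union (mIoU). As a minimal sketch of how this metric is commonly computed (not the authors' evaluation code), the example below accumulates a confusion matrix over flattened label sequences and averages the per-class IoU. The function names `confusion_matrix` and `mean_iou` are hypothetical, and the ignore label of 255 is an assumption borrowed from the common Cityscapes convention.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count (ground truth, prediction) pairs; rows are ground-truth classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed "void" label, skipped in the metric
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at least once."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:                               # skip classes absent from both pred and target
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes; the last pixel carries the ignore label.
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(confusion_matrix(pred, target, num_classes=3))
print(f"mIoU = {miou:.4f}")
```

In the toy example the per-class IoU values are 1/2, 2/3 and 1, so the mean is 13/18. In a real evaluation the confusion matrix would be accumulated over every image in the validation set before the per-class IoU is taken.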
Pages: 17224-17240
Page count: 17