Deep Multi-Branch Aggregation Network for Real-Time Semantic Segmentation in Street Scenes

Cited by: 24
Authors
Weng, Xi [1]
Yan, Yan [1]
Dong, Genshun [1]
Shu, Chang [2]
Wang, Biao [3]
Wang, Hanzi [1]
Zhang, Ji [3, 4]
Affiliations
[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen 361005, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
[3] Zhejiang Lab, Hangzhou 311101, Peoples R China
[4] Univ Southern Queensland, Sch Math Phys & Comp, Toowoomba, Qld 4350, Australia
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Semantics; Real-time systems; Image segmentation; Lattices; Decoding; Task analysis; Feature extraction; Deep learning; real-time semantic segmentation; lightweight convolutional neural networks; multi-branch aggregation;
DOI
10.1109/TITS.2022.3150350
Chinese Library Classification number
TU [Building Science]
Discipline classification code
0813
Abstract
Real-time semantic segmentation, which aims to achieve high segmentation accuracy at real-time inference speed, has received substantial attention over the past few years. However, many state-of-the-art real-time semantic segmentation methods sacrifice some spatial details or contextual information for fast inference, which degrades segmentation quality. In this paper, we propose a novel Deep Multi-branch Aggregation Network (called DMA-Net) based on the encoder-decoder structure to perform real-time semantic segmentation in street scenes. Specifically, we first adopt ResNet-18 as the encoder to efficiently generate feature maps at various levels from different convolutional stages. Then, we develop a Multi-branch Aggregation Network (MAN) as the decoder to effectively aggregate the different levels of feature maps and capture multi-scale information. In MAN, a lattice enhanced residual block is designed to enhance the feature representations of the network by taking advantage of the lattice structure. Meanwhile, a feature transformation block is introduced to explicitly transform the feature map from the neighboring branch before feature aggregation. Moreover, a global context block is used to exploit global contextual information. These key components are tightly combined and jointly optimized in a unified network. Extensive experimental results on the challenging Cityscapes and CamVid datasets demonstrate that the proposed DMA-Net obtains 77.0% and 73.6% mean Intersection over Union (mIoU) at inference speeds of 46.7 FPS and 119.8 FPS, respectively, using only a single NVIDIA GTX 1080Ti GPU. This shows that DMA-Net provides a good tradeoff between segmentation quality and speed for semantic segmentation in street scenes.
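The abstract reports accuracy as mean Intersection over Union (mIoU). As an illustration of that metric only, the following is a minimal sketch and not the authors' evaluation code. The function name `mean_iou`, the flat label lists, and the `ignore_index=255` default are assumptions made for this example. Benchmark mIoU is normally accumulated over an entire test set rather than computed on a single image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels.
    Positions whose target equals ignore_index are skipped
    (ignore_index=255 is an assumed convention, not from the abstract).
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    # Average only over classes that actually appear.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Toy labels: three classes, last position ignored.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(f"mIoU = {mean_iou(pred, target, num_classes=3):.4f}")
```

On the toy labels the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18.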
Pages: 17224-17240
Page count: 17