Efficient pyramid context encoding and feature embedding for semantic segmentation

Cited by: 9
Authors
Liu, Mengyu [1]
Yin, Hujun [1]
Affiliations
[1] Univ Manchester, Dept Elect & Elect Engn, Manchester, Lancs, England
Keywords
Semantic segmentation; Convolutional neural networks; Pyramid context encoding; Real-time processing
DOI
10.1016/j.imavis.2021.104195
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For real-world applications of semantic segmentation, inference speed and memory usage are two important factors. To address these challenges, we propose a lightweight feature pyramid encoding network (FPENet) for semantic segmentation with a good trade-off between accuracy and speed. We use a series of feature pyramid encoding (FPE) blocks to encode context at multiple scales in the encoder. Each FPE block consists of depthwise convolutions with different dilation rates, which act as a spatial pyramid to extract features while reducing computational cost. During training, a one-shot neural architecture search algorithm is adopted to find the optimal structure for each FPE block from a large search space at a small search cost. After the encoder search, a mutual embedding upsample module consisting of two attention blocks is introduced in the decoder. This encoder-decoder attention mechanism helps to aggregate high-level semantic features and low-level spatial details efficiently. The proposed network outperforms existing real-time methods with fewer parameters and improved inference speed on the Cityscapes and CamVid benchmark datasets. Specifically, it achieved 72.3% mean IoU on the Cityscapes test set with only 0.4 M parameters at 192.6 FPS on an Nvidia Titan V100 GPU, and 73.4% mean IoU at 116.2 FPS when running on higher-resolution images. (c) 2021 Elsevier B.V. All rights reserved.
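The abstract's point that depthwise dilated convolutions can act as a spatial pyramid at low cost can be illustrated with some simple arithmetic. The sketch below is not the authors' implementation: the channel count (64), the kernel size (3), the dilation rates (1, 2, 4, 8), and the idea of splitting channels evenly across branches are all assumptions chosen for illustration, and bias terms are ignored.

```python
# Illustrative arithmetic only; all constants below are assumptions,
# not values taken from the abstract.
CHANNELS = 64             # hypothetical number of feature channels
KERNEL = 3                # hypothetical kernel size (3x3)
DILATIONS = (1, 2, 4, 8)  # hypothetical dilation rates, one per branch


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def standard_conv_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Weight count of a standard convolution (bias ignored)."""
    return c_in * c_out * kernel_size * kernel_size


def depthwise_conv_params(channels: int, kernel_size: int) -> int:
    """Weight count of a depthwise convolution: one filter per channel."""
    return channels * kernel_size * kernel_size


def pyramid_summary(channels=CHANNELS, kernel=KERNEL, dilations=DILATIONS):
    """Split channels evenly across branches, one dilation rate per branch."""
    group = channels // len(dilations)
    return [
        {
            "dilation": d,
            "effective_kernel": effective_kernel_size(kernel, d),
            "params": depthwise_conv_params(group, kernel),
        }
        for d in dilations
    ]


if __name__ == "__main__":
    branches = pyramid_summary()
    for branch in branches:
        print(branch)
    print("pyramid weights:", sum(b["params"] for b in branches))
    print("standard conv weights:",
          standard_conv_params(CHANNELS, CHANNELS, KERNEL))
```

Under these assumptions every branch has the same number of weights, while the spatial extent it covers grows with the dilation rate. That is the sense in which a set of dilated depthwise branches can behave like a multi-scale pyramid for far fewer weights than a standard convolution over the same channels.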
Pages: 11
Related papers
50 in total
  • [21] Semantic Segmentation Based on Spatial Pyramid Pooling and Multilayer Feature Fusion
    Ji, Jian
    Li, Sitong
    Liao, Xianfu
    Zhang, Fangrong
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2023, 15 (03) : 1524 - 1535
  • [22] Global and Compact Video Context Embedding for Video Semantic Segmentation
    Sun, Lei
    Liu, Yun
    Sun, Guolei
    Wu, Min
    Xu, Zhijie
    Wang, Kaiwei
    Van Gool, Luc
    IEEE ACCESS, 2024, 12 : 135589 - 135600
  • [24] Context propagation embedding network for weakly supervised semantic segmentation
    Xu, Yajun
    Mao, Zhendong
    Chen, Zhineng
    Wen, Xin
    Li, Yangyang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (45-46) : 33925 - 33942
  • [25] Semantic Segmentation With Context Encoding and Multi-Path Decoding
    Ding, Henghui
    Jiang, Xudong
    Shuai, Bing
    Liu, Ai Qun
    Wang, Gang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 3520 - 3533
  • [26] Triple fusion and feature pyramid decoder for RGB-D semantic segmentation
    Ge, Bin
    Zhu, Xu
    Tang, Zihan
    Xia, Chenxing
    Lu, Yiming
    Chen, Zhuang
    MULTIMEDIA SYSTEMS, 2024, 30 (05)
  • [27] FPANet: Feature pyramid aggregation network for real-time semantic segmentation
    Wu, Yun
    Jiang, Jianyong
    Huang, Zimeng
    Tian, Youliang
    APPLIED INTELLIGENCE, 2022, 52 (03) : 3319 - 3336
  • [29] Enhanced Feature Pyramid Network With Deep Semantic Embedding for Remote Sensing Scene Classification
    Wang, Xin
    Wang, Shiyi
    Ning, Chen
    Zhou, Huiyu
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (09) : 7918 - 7932
  • [30] DSMRSeg: Dual-Stage Feature Pyramid and Multi-Range Context Aggregation for Real-Time Semantic Segmentation
    Yang, Mingdong
    Shi, Ying
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT IV, 2019, 1142 : 265 - 273