Efficient pyramid context encoding and feature embedding for semantic segmentation

Cited by: 9
Authors
Liu, Mengyu [1]
Yin, Hujun [1]
Affiliation
[1] Univ Manchester, Dept Elect & Elect Engn, Manchester, Lancs, England
Keywords
Semantic segmentation; Convolutional neural networks; Pyramid context encoding; Real-time processing
DOI
10.1016/j.imavis.2021.104195
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For real-world applications of semantic segmentation, inference speed and memory usage are two important factors. To address these challenges, we propose a lightweight feature pyramid encoding network (FPENet) for semantic segmentation with a good trade-off between accuracy and speed. We use a series of feature pyramid encoding (FPE) blocks to encode context at multiple scales in the encoder. Each FPE block consists of different depthwise dilated convolutions that act as a spatial pyramid to extract features and reduce computational costs. During training, a one-shot neural architecture search algorithm is adopted to find the optimal structure for each FPE block from a large search space with a small search cost. After the search for the encoder, a mutual embedding upsample module, consisting of two attention blocks, is introduced in the decoder. The encoder-decoder attention mechanism is used to help efficiently aggregate high-level semantic features and low-level spatial details. The proposed network outperforms the existing real-time methods with fewer parameters and improved inference speed on the Cityscapes and CamVid benchmark datasets. Specifically, it achieved 72.3% mean IoU on the Cityscapes test set with only 0.4 M parameters and 192.6 FPS speed on an Nvidia Titan V100 GPU, and 73.4% mean IoU with 116.2 FPS when running on higher resolution images. © 2021 Elsevier B.V. All rights reserved.
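As a rough illustration of why a pyramid of depthwise dilated convolutions can be cheap, the sketch below counts weights for a set of parallel 3x3 branches and reports the effective kernel extent of each dilation rate. It is not the FPENet implementation: the channel count (64), the dilation rates (1, 2, 4, 8), the equal split of channels across branches, and all function names are assumptions chosen for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_conv_params(channels, k=3):
    """Weights in a depthwise k x k convolution: one k x k filter per channel."""
    return channels * k * k


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def pyramid_summary(channels, dilations, k=3):
    """Compare weight counts for a pyramid of parallel branches.

    Assumption (hypothetical layout): the input channels are split equally
    across the branches, and each branch applies one k x k convolution
    with its own dilation rate.
    """
    branch_channels = channels // len(dilations)
    depthwise = sum(depthwise_conv_params(branch_channels, k) for _ in dilations)
    standard = sum(
        standard_conv_params(branch_channels, branch_channels, k) for _ in dilations
    )
    return {
        "depthwise_params": depthwise,
        "standard_params": standard,
        "effective_kernel_sizes": [effective_kernel_size(k, d) for d in dilations],
    }


summary = pyramid_summary(channels=64, dilations=(1, 2, 4, 8))
print(summary)
```

Under these assumptions the dilation rate changes the spatial extent each branch covers without changing its weight count, which is the sense in which the branches form a multi-scale pyramid at low cost.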
Pages: 11
Related papers
50 in total
  • [41] Deep Common Feature Mining for Efficient Video Semantic Segmentation
    Zheng, Yaoyan
    Yang, Hongyu
    Huang, Di
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 12991 - 13003
  • [42] Context and Spatial Feature Calibration for Real-Time Semantic Segmentation
    Li, Kaige
    Geng, Qichuan
    Wan, Maoxian
    Cao, Xiaochun
    Zhou, Zhong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 5465 - 5477
  • [43] Multilevel Context Feature Fusion for Semantic Segmentation of ALS Point Cloud
    Zeng, Tao
    Luo, Fulin
    Guo, Tan
    Gong, Xiuwen
    Xue, Jingyun
    Li, Hanshan
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [44] Semantic Segmentation of Breast Ultrasound Image with Pyramid Fuzzy Uncertainty Reduction and Direction Connectedness Feature
    Huang, Kuan
    Zhang, Yingtao
    Cheng, H. D.
    Xing, Ping
    Zhang, Boyu
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 3357 - 3364
  • [45] Semantic Context Encoding for Accurate 3D Point Cloud Segmentation
    Liu, Hao
    Guo, Yulan
    Ma, Yanni
    Lei, Yinjie
    Wen, Gongjian
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 2045 - 2055
  • [46] Enhanced Feature Pyramid Vision Transformer for Semantic Segmentation on Thailand Landsat-8 Corpus
    Intarat, Kritchayan
    Rakwatin, Preesan
    Panboonyuen, Teerapong
    INFORMATION, 2022, 13 (05)
  • [47] Real-time semantic segmentation method for field grapes based on channel feature pyramid
    Sun J.
    Gong D.
    Yao K.
    Lu B.
    Dai C.
    Wu X.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2022, 38 (17) : 150 - 157
  • [48] Multilevel Geometric Feature Embedding in Transformer Network for ALS Point Cloud Semantic Segmentation
    Liang, Zhuanxin
    Lai, Xudong
    REMOTE SENSING, 2024, 16 (18)
  • [49] Efficient Parallel Multi-Scale Detail and Semantic Encoding Network for Lightweight Semantic Segmentation
    Liu, Xiao
    Shi, Xiuya
    Chen, Lufei
    Qing, Linbo
    Ren, Chao
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2544 - 2552
  • [50] Mixed spatial pyramid pooling for semantic segmentation
    Xia, Zhengyu
    Kim, Joohee
    APPLIED SOFT COMPUTING, 2020, 91