Semantic Relocation Parallel Network for Semantic Segmentation

Cited by: 0
Authors
Chen S. [1 ]
Xu L. [1 ]
Zou B. [2 ]
Chen J. [3 ]
Affiliations
[1] School of Computer Science, Xiangtan University, Xiangtan
[2] School of Computer Science and Engineering, Central South University, Changsha
[3] Computer Center, Xiangtan University, Xiangtan
Keywords
Feature extractor; Feature fusion; Semantic relocation; Semantic segmentation
DOI
10.3724/SP.J.1089.2022.18909
Abstract
Semantic segmentation is an essential problem in computer vision; its difficulty lies in accurate pixel-level prediction and in dividing the edges of similar objects. Many methods adopt the encoder-decoder structure to capture the global information of semantic objects. However, continuous downsampling causes irreversible loss of spatial information in the feature maps. A semantic relocation parallel network (SRPNet) is proposed. Specifically, a high-resolution global spatial path is designed to extract rich spatial information from feature maps that retain high resolution. In the feature extraction path, a powerful feature extractor expands the receptive field through fast downsampling. In addition, a semantic relocation module (SRM) is designed to compensate for the context information lost through repeated downsampling. Dice loss is employed to alleviate the imbalance between positive and negative samples in the dataset and to obtain better segmentation performance. Finally, the proposed network is evaluated on the Cityscapes and CamVid datasets. The results show that SRPNet improves on the previous best result by approximately 3.1% and 1.8% mIoU on the CamVid and Cityscapes datasets, respectively. © 2022, Beijing China Science Journal Publishing Co. Ltd. All rights reserved.
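The Dice loss mentioned in the abstract is a standard formulation for class-imbalanced segmentation, 1 − 2|X∩Y| / (|X|+|Y|); the sketch below shows this generic form in NumPy, not the authors' exact implementation (the smoothing constant `eps` is an assumption):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a binary mask.

    pred:   predicted probabilities in [0, 1], any shape
    target: ground-truth binary mask, same shape
    eps:    small constant for numerical stability (hypothetical choice)
    """
    pred = pred.ravel()
    target = target.ravel()
    intersection = (pred * target).sum()
    # Soft Dice coefficient: 2|X ∩ Y| / (|X| + |Y|)
    dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
    return 1.0 - dice

mask = np.array([[0, 1], [1, 1]], dtype=float)
print(round(dice_loss(mask, mask), 6))  # a perfect prediction gives 0.0
```

Because the loss is driven by the overlap ratio rather than a per-pixel count, large background regions contribute little, which is why it helps when positive pixels are scarce.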
Pages: 373-381 (8 pages)