SDDS-Net: Space and Depth Encoder-Decoder Convolutional Neural Networks for Real-Time Semantic Segmentation

Cited by: 1
Authors
Ibrahem, Hatem [1]
Salem, Ahmed [1, 2]
Kang, Hyun-Soo [1]
Affiliations
[1] Chungbuk Natl Univ, Sch Elect & Comp Engn, Cheongju 28644, South Korea
[2] Assiut Univ, Fac Engn, Dept Elect Engn, Assiut 71515, Egypt
Keywords
Convolutional neural networks; Real-time systems; image classification; image-to-image translation; real-time processing; semantic segmentation; ARCHITECTURES; EFFICIENT
DOI
10.1109/ACCESS.2023.3327323
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose novel convolutional encoder-decoder architectures for real-time semantic segmentation, based on an image-to-image translation approach that uses space-to-depth and depth-to-space modules. The architectures compress the spatial information of the image with the space-to-depth (SD) module instead of the commonly used pooling methods (max-pooling and average-pooling) or strided convolutions. The SD module reduces the image size while preserving the spatial information of the image as extra depth information, whereas pooling discards information and image detail. We also propose a simple, lightweight decoder stage built on the depth-to-space (DS) module, which constructs a high-resolution dense prediction map from a large number of low-resolution feature maps. The proposed architectures learn image classification and semantic segmentation efficiently, with high accuracy and average processing speed. We trained and tested them on image classification benchmarks (CIFAR10 and Tiny ImageNet) and on indoor and outdoor semantic segmentation benchmarks, specifically NYU-depthV2 and CITYSCAPES. The proposed architectures attain high classification accuracy (94.28% on CIFAR10 and 72.25% on Tiny ImageNet) and high mean average precision and pixel accuracy in semantic segmentation (pixel accuracy of 78.55% on NYU-depthV2 and 87.9% on CITYSCAPES) while maintaining real-time frame processing speed, outperforming recent state-of-the-art methods in semantic segmentation.
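The space-to-depth and depth-to-space rearrangements named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the channels-last (H, W, C) layout, the ordering of values within the depth axis, and the block size of 2 are all assumptions made for illustration. The point it shows is that space-to-depth shrinks the spatial size while keeping every value, so depth-to-space can restore the original exactly.

```python
import numpy as np


def space_to_depth(x, block):
    """Move each block x block spatial patch into the channel axis.

    Assumes a channels-last array of shape (H, W, C); returns an array of
    shape (H / block, W / block, C * block * block). No values are dropped.
    """
    h, w, c = x.shape
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)  # -> (H/b, W/b, b, b, C)
    return x.reshape(h // block, w // block, block * block * c)


def depth_to_space(x, block):
    """Inverse of space_to_depth: unfold channels back into spatial patches."""
    h, w, d = x.shape
    if d % (block * block):
        raise ValueError("depth must be divisible by block * block")
    c = d // (block * block)
    x = x.reshape(h, w, block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)  # -> (H, b, W, b, C)
    return x.reshape(h * block, w * block, c)


# Hypothetical 4x4 RGB-like input, used only to demonstrate the shapes.
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
packed = space_to_depth(image, 2)    # shape (2, 2, 12)
restored = depth_to_space(packed, 2)  # shape (4, 4, 3), equal to image
```

By contrast, a 2x2 max-pooling of the same input would keep only one of every four values per channel, which is the information loss the abstract refers to.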
Pages: 119362-119372
Number of pages: 11