Multi-Encoder Context Aggregation Network for Structured and Unstructured Urban Street Scene Analysis

Authors
Singha, Tanmay [1]
Pham, Duc-Son [1]
Krishna, Aneesh [1]
Affiliations
[1] Curtin Univ, Sch Elect Engn Comp & Math Sci, Perth, WA, Australia
Keywords
Semantic segmentation; feature scaling; feature aggregation; deep learning; scene understanding; convolutional neural networks
DOI
10.1109/ACCESS.2023.3289968
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Developing computationally efficient semantic segmentation models suitable for resource-constrained mobile devices is an open challenge in computer vision research. To address this challenge, we propose a novel real-time semantic scene segmentation model called Multi-encoder Context Aggregation Network (MCANet), which offers the best combination of low model complexity and state-of-the-art (SOTA) performance on benchmark datasets. While we follow the multi-encoder approach, our novelty lies in varying the number of scales to capture both global context and local details effectively. We introduce suitable lateral connections between sub-encoders for improved feature refinement. We also optimize the backbone by exploiting the residual block of MobileNet for resource-constrained applications. On the decoder side, the proposed model includes a new Local and Global Context Aggregation (LGCA) module that significantly enhances semantic details in the segmentation output. Finally, we use several known efficient convolution techniques in the classification module to make the model more computationally efficient. We provide a comprehensive evaluation of MCANet on multiple datasets containing structured and unstructured urban street scenes. Among existing real-time models with fewer than 3 million parameters, the proposed model is more competitive: it achieves SOTA performance without ImageNet pre-trained weights in both structured and unstructured environments while being more compact for resource-constrained applications.
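The abstract does not say which efficient convolution techniques MCANet uses. As a rough illustration of why such techniques reduce model size, the sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution, one widely used technique of this kind. The function names and the channel and kernel sizes are hypothetical and are not taken from the paper.

```python
# Illustrative only: the layer sizes are assumptions, not values from MCANet.

def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise convolution followed by
    a 1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 128 input channels, 128 output channels, 3 x 3 kernel.
standard = conv_params(128, 128, 3)
separable = depthwise_separable_params(128, 128, 3)
ratio = standard / separable

print(standard, separable, round(ratio, 2))
```

For this hypothetical layer the separable form needs roughly one eighth of the weights, which suggests how a model can stay within a budget of a few million parameters.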
Pages: 66227-66244
Page count: 18