Multi-Encoder Context Aggregation Network for Structured and Unstructured Urban Street Scene Analysis

Cited by: 0
Authors
Singha, Tanmay [1]
Pham, Duc-Son [1]
Krishna, Aneesh [1]
Affiliations
[1] Curtin Univ, Sch Elect Engn Comp & Math Sci, Perth, WA, Australia
Keywords
Semantic segmentation; feature scaling; feature aggregation; deep learning; scene understanding; convolutional neural networks; segmentation
DOI
10.1109/ACCESS.2023.3289968
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Developing computationally efficient semantic segmentation models that are suitable for resource-constrained mobile devices is an open challenge in computer vision research. To address this challenge, we propose a novel real-time semantic scene segmentation model called Multi-encoder Context Aggregation Network (MCANet), which offers the best combination of low model complexity and state-of-the-art (SOTA) performance on benchmark datasets. While we follow the multi-encoder approach, our novelty lies in varying the number of scales to capture both global context and local details effectively. We introduce suitable lateral connections between sub-encoders for improved feature refinement. We also optimize the backbone by exploiting the residual block of MobileNet for resource-constrained applications. On the decoder side, the proposed model includes a new Local and Global Context Aggregation (LGCA) module that significantly enhances semantic details in the segmentation output. Finally, we use several known efficient convolution techniques for the classification module to make the model more computationally efficient. We provide a comprehensive evaluation of MCANet on multiple datasets containing structured and unstructured urban street scenes. Among the existing real-time models with fewer than 3 million parameters, the proposed model is more competitive, as it achieves SOTA performance without ImageNet pre-trained weights in both structured and unstructured environments while being more compact for resource-constrained applications.
Pages: 66227-66244
Page count: 18
Related papers
50 in total
  • [1] Urban street scene analysis using lightweight multi-level multi-path feature aggregation network
    Singha, Tanmay
    Pham, Duc-Son
    Krishna, Aneesh
    MULTIAGENT AND GRID SYSTEMS, 2021, 17 (03) : 249 - 271
  • [2] Sparse-to-Dense Multi-Encoder Shape Completion of Unstructured Point Cloud
    Peng, Yanjun
    Chang, Ming
    Wang, Qiong
    Qian, Yinling
    Zhang, Yingkui
    Wei, Mingqiang
    Liao, Xiangyun
    IEEE ACCESS, 2020, 8 : 30969 - 30978
  • [3] Multi-Encoder Sequential Attention Network for Context-Aware Speech Recognition in Japanese Dialog Conversation
    Tachimori, Nobuya
    Sakti, Sakriani
    Nakamura, Satoshi
    2021 24th Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2021, 2021, : 1 - 6
  • [4] MULTI-ENCODER SEQUENTIAL ATTENTION NETWORK FOR CONTEXT-AWARE SPEECH RECOGNITION IN JAPANESE DIALOG CONVERSATION
    Tachimori, Nobuya
    Sakti, Sakriani
    Nakamura, Satoshi
    2021 24TH CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA), 2021, : 1 - 6
  • [5] Detecting Multiple Steganography Methods in Speech Streams Using Multi-Encoder Network
    Tian, Hui
    Wu, Junyan
    Quan, Hanyu
    Chang, Chin-Chen
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2462 - 2466
  • [6] Relation Extraction using Multi-Encoder LSTM Network on a Distant Supervised Dataset
    Banerjee, Siddhartha
    Tsioutsiouliklis, Kostas
    2018 IEEE 12TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2018, : 235 - 238
  • [7] Multi-encoder Network for Parameter Reduction of a Kernel-based Interpolation Architecture
    Khalifeh, Issa
    Blanch, Marc Gorriz
    Izquierdo, Ebroul
    Mrak, Marta
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 724 - 733
  • [8] MULTI-ENCODER PARSE-DECODER NETWORK FOR SEQUENTIAL MEDICAL IMAGE SEGMENTATION
    Shi, Dachuan
    Liu, Ruiyang
    Tao, Linmi
    He, Zuoxiang
    Huo, Li
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 31 - 35
  • [9] Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation
    Li, Bei
    Liu, Hui
    Wang, Ziyang
    Jiang, Yufan
    Xiao, Tong
    Zhu, Jingbo
    Liu, Tongran
    Li, Changliang
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3512 - 3518
  • [10] Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models
    Lupo, Lorenzo
    Dinarelli, Marco
    Besacier, Laurent
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 4557 - 4572