Joint pyramid attention network for real-time semantic segmentation of urban scenes

Cited by: 0
Authors
Hu, Xuegang [1]
Jing, Liyuan [2]
Sehar, Uroosa [3]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Key Lab Intelligent Anal & Decis Complex Syst, Chongqing 400065, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Multimedia Commun Res Lab, Chongqing 400065, Peoples R China
[3] Northeastern Univ, Cross Media Artificial Intelligence Lab, Shenyang 110000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Attention mechanism; Encoder-decoder network; Feature pyramid module; Lightweight network; Real-time semantic segmentation; NEURAL-NETWORK;
DOI
10.1007/s10489-021-02446-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semantic segmentation is an advanced research topic in computer vision and can be regarded as a fundamental technique for image understanding and analysis. However, most current semantic segmentation networks focus only on segmentation accuracy and ignore the requirements for high processing speed and low computational complexity in mobile terminal fields such as autonomous driving systems, drone applications, and fingerprint recognition systems. Because of this high computational cost, current semantic segmentation methods have difficulty meeting actual industrial needs. To address this problem, we propose a joint pyramid attention network (JPANet) for real-time semantic segmentation. First, we propose a joint feature pyramid (JFP) module, which combines multiple network stages to learn multi-scale feature representations with strong semantic information, thereby improving pixel classification performance. Second, we build a spatial detail extraction (SDE) module to capture multi-level local features from the shallow network and compensate for the geometric information lost in the down-sampling stage. Finally, we design a bilateral feature fusion (BFF) module, which integrates spatial and semantic information through a hybrid attention mechanism over the spatial and channel dimensions, making full use of the correspondence between high-level and low-level features. We conducted a series of experiments on two challenging urban road scene datasets, Cityscapes and CamVid, and achieved excellent results. On the Cityscapes dataset, for 512 × 1024 high-resolution images, our method achieves 71.62% mean Intersection over Union (mIoU) at 109.9 frames per second (FPS) on a single 1080Ti GPU.
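The mIoU figure quoted in the abstract is the standard segmentation metric: for each class, IoU = TP / (TP + FP + FN), averaged over classes. The sketch below illustrates that definition on a toy label sequence in plain Python. It is not the authors' evaluation code; the function names, the toy labels, and the use of 255 as an ignored "void" label (a common Cityscapes convention) are assumptions made for illustration.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count pixels by (true class, predicted class) over flat label lists.

    ignore_index=255 is an assumed "void" label, skipped during counting.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # true c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six "pixels", three classes, one void pixel (hypothetical data).
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(pred, target, num_classes=3)
miou = mean_iou(matrix)
print(f"mIoU = {miou:.4f}")
```

In practice the confusion matrix is accumulated over every image in the validation set before the per-class IoU values are averaged, rather than averaging per-image scores.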
Pages: 580-594
Number of pages: 15
Related papers
50 in total
  • [1] Joint pyramid attention network for real-time semantic segmentation of urban scenes
    Xuegang Hu
    Liyuan Jing
    Uroosa Sehar
    [J]. Applied Intelligence, 2022, 52 : 580 - 594
  • [2] DSANet: Dilated spatial attention for real-time semantic segmentation in urban street scenes
    Elhassan, Mohammed A. M.
    Huang, Chenxi
    Yang, Chenhui
    Munea, Tewodros Legesse
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 183
  • [3] Satellite Component Semantic Segmentation: Video Dataset and Real-Time Pyramid Attention and Decoupled Attention Network
    Shao, Yadong
    Wu, Aodi
    Li, Shengyang
    Shu, Leizheng
    Wan, Xue
    Shao, Yuanbin
    Huo, Junyan
    [J]. IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2023, 59 (06) : 7315 - 7333
  • [4] Research on Efficient Asymmetric Attention Module for Real-Time Semantic Segmentation Networks in Urban Scenes
    Su, Xu
    Li, Lihong
    Xiao, Jiejie
    Wang, Pengtao
    [J]. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2024, 28 (03) : 562 - 572
  • [5] RTSNet: Real-Time Semantic Segmentation Network For Outdoor Scenes
    Ma, Mingyu
    Zou, Fengshan
    Xu, Fang
    Song, Jilai
    [J]. 2019 9TH IEEE ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER 2019), 2019, : 659 - 664
  • [6] Small Object Augmentation of Urban Scenes for Real-Time Semantic Segmentation
    Yang, Zhengeng
    Yu, Hongshan
    Feng, Mingtao
    Sun, Wei
    Lin, Xuefei
    Sun, Mingui
    Mao, Zhi-Hong
    Mian, Ajmal
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 5175 - 5190
  • [7] LCFNet: Loss Compensation Fusion Network for Real-time Semantic Segmentation of Urban Road Scenes
    Yang, Lu
    Bai, Yiwen
    Ren, Fenglei
    Zhang, Shiyu
    Bi, Chongke
    [J]. 2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 347 - 354
  • [8] FPANet: Feature pyramid aggregation network for real-time semantic segmentation
    Wu, Yun
    Jiang, Jianyong
    Huang, Zimeng
    Tian, Youliang
    [J]. APPLIED INTELLIGENCE, 2022, 52 (03) : 3319 - 3336
  • [9] FPANet: Feature pyramid aggregation network for real-time semantic segmentation
    Yun Wu
    Jianyong Jiang
    Zimeng Huang
    Youliang Tian
    [J]. Applied Intelligence, 2022, 52 : 3319 - 3336
  • [10] Parallel Complement Network for Real-Time Semantic Segmentation of Road Scenes
    Lv, Qingxuan
    Sun, Xin
    Chen, Changrui
    Dong, Junyu
    Zhou, Huiyu
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (05) : 4432 - 4444