Small Object Augmentation of Urban Scenes for Real-Time Semantic Segmentation

Cited by: 43
Authors
Yang, Zhengeng [1 ,2 ,3 ]
Yu, Hongshan [1 ,2 ]
Feng, Mingtao [1 ,2 ]
Sun, Wei [1 ,2 ]
Lin, Xuefei [4 ]
Sun, Mingui [3 ,5 ,6 ]
Mao, Zhi-Hong [5 ,6 ]
Mian, Ajmal [7 ]
Affiliations
[1] Hunan Univ, Natl Engn Lab Robot Visual Percept & Control Tech, Coll Elect & Informat Engn, Changsha 410082, Hunan, Peoples R China
[2] Hunan Univ, Shenzhen Inst, Shenzhen 518057, Peoples R China
[3] Univ Pittsburgh, Dept Neurol Surg, Pittsburgh, PA 15260 USA
[4] Hunan Agr Univ, Dept Art, Changsha 410128, Peoples R China
[5] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15260 USA
[6] Univ Pittsburgh, Dept Bioengn, Pittsburgh, PA 15260 USA
[7] Univ Western Australia, Dept Comp Sci, Perth, WA 6009, Australia
Funding
National Institutes of Health (USA); Natural Science Foundation of Hunan Province; National Natural Science Foundation of China
Keywords
Semantic segmentation; scene understanding; autonomous driving; synthetic dataset; FEATURES; NETWORK;
DOI
10.1109/TIP.2020.2976856
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semantic segmentation is a key step in scene understanding for autonomous driving. Although deep learning has significantly improved the segmentation accuracy, current high-quality models such as PSPNet and DeepLabV3 are inefficient given their complex architectures and reliance on multi-scale inputs. Thus, it is difficult to apply them to real-time or practical applications. On the other hand, existing real-time methods cannot yet produce satisfactory results on small objects such as traffic lights, which are imperative to safe autonomous driving. In this paper, we improve the performance of real-time semantic segmentation from two perspectives, methodology and data. Specifically, we propose a real-time segmentation model coined Narrow Deep Network (NDNet) and build a synthetic dataset by inserting additional small objects into the training images. The proposed method achieves 65.7% mean intersection over union (mIoU) on the Cityscapes test set with only 8.4G floating-point operations (FLOPs) on $1024\times 2048$ inputs. Furthermore, by re-training the existing PSPNet and DeepLabV3 models on our synthetic dataset, we obtained an average 2% mIoU improvement on small objects.
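The abstract says that small objects are inserted into training images, but this record does not say how. The following is only a minimal sketch of the general idea, not the authors' pipeline: the function name `paste_small_object`, its parameters, the placement position, and the label value 6 are all hypothetical choices made for illustration. It copies the masked pixels of a small object patch into an image and writes the matching class label into the annotation map.

```python
import numpy as np

def paste_small_object(image, labels, patch, patch_mask, class_id, top, left):
    """Paste a small object into an image and its label map (illustrative only).

    image:      (H, W, 3) array, the training image
    labels:     (H, W) array, per-pixel class labels
    patch:      (h, w, 3) array, pixels of the object to insert
    patch_mask: (h, w) boolean array, True where the object is present
    class_id:   label written for every pasted pixel (hypothetical value)
    top, left:  upper-left corner of the paste position
    Returns modified copies; the inputs are left unchanged.
    """
    h, w = patch_mask.shape
    H, W = labels.shape
    if top < 0 or left < 0 or top + h > H or left + w > W:
        raise ValueError("patch does not fit inside the image at this position")

    out_img = image.copy()
    out_lbl = labels.copy()
    # Basic slices are views, so writing through them updates the copies.
    region_img = out_img[top:top + h, left:left + w]
    region_lbl = out_lbl[top:top + h, left:left + w]
    region_img[patch_mask] = patch[patch_mask]
    region_lbl[patch_mask] = class_id
    return out_img, out_lbl

# Toy usage: a 2x2 patch with three object pixels pasted into an empty scene.
image = np.zeros((8, 8, 3), dtype=np.uint8)
labels = np.zeros((8, 8), dtype=np.int64)
patch = np.full((2, 2, 3), 255, dtype=np.uint8)
patch_mask = np.array([[True, False], [True, True]])
aug_img, aug_lbl = paste_small_object(
    image, labels, patch, patch_mask, class_id=6, top=3, left=4
)
```

A real pipeline would also need to decide where objects may plausibly appear and how to blend patch borders; those choices are outside what this record describes.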
Pages: 5175-5190
Page count: 16
Related papers
  • [21] BSDNet: Balanced Sample Distribution Network for Real-Time Semantic Segmentation of Road Scenes
    Ye, Lv
    Zeng, Jianxu
    Yang, Yue
    Chimaobi, Ashara Emmanuel
    Sekenya, Nyaradzo Mercy
    IEEE ACCESS, 2021, 9 : 84034 - 84044
  • [22] DRMNet: more efficient bilateral networks for real-time semantic segmentation of road scenes
    Zhang, Wenming
    Zhang, Shaotong
    Li, Yaqian
    Li, Haibin
    Song, Tao
    Journal of Real-Time Image Processing, 2024, 21 (06)
  • [24] Triple-Branch Asymmetric Network for Real-time Semantic Segmentation of Road Scenes
    Zhang, Yazhi
    Zhang, Xuguang
    Yu, Hui
    Instrumentation, 2024, 11 (02) : 72 - 82
  • [25] Exploring Scale-Aware Features for Real-Time Semantic Segmentation of Street Scenes
    Li, Kaige
    Geng, Qichuan
    Zhou, Zhong
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (05) : 3575 - 3587
  • [26] Deep Multi-Resolution Network for Real-Time Semantic Segmentation in Street Scenes
    Wang, Yalun
    Chen, Shidong
    Bian, Huicong
    Li, Weixiao
    Lu, Qin
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [27] Gated feature aggregate and alignment network for real-time semantic segmentation of street scenes
    Liu, Qian
    Li, Zhensheng
    Qi, Youwei
    Wang, Cunbao
    MULTIMEDIA SYSTEMS, 2024, 30 (04)
  • [28] BSSNet: A Real-Time Semantic Segmentation Network for Road Scenes Inspired From AutoEncoder
    Shi, Xiaoqiang
    Yin, Zhenyu
    Han, Guangjie
    Liu, Wenzhuo
    Qin, Li
    Bi, Yuanguo
    Li, Shurui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3424 - 3438
  • [29] Real-Time Semantic Clothing Segmentation
    Cushen, George A.
    Nixon, Mark S.
    ADVANCES IN VISUAL COMPUTING, ISVC 2012, PT I, 2012, 7431 : 272 - 281
  • [30] Depth-Wise Asymmetric Bottleneck With Point-Wise Aggregation Decoder for Real-Time Semantic Segmentation in Urban Scenes
    Li, Gen
    Jiang, Shenlu
    Yun, Inyong
    Kim, Jonghyun
    Kim, Joongkyu
    IEEE ACCESS, 2020, 8 : 27495 - 27506