MiniNet: An extremely lightweight convolutional neural network for real-time unsupervised monocular depth estimation

Cited by: 25
Authors
Liu, Jun [1, 2, 3]
Li, Qing [1, 2, 3]
Cao, Rui [1, 2, 3]
Tang, Wenming [1, 2, 3]
Qiu, Guoping [1, 2, 3, 4]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen, Peoples R China
[2] Shenzhen Univ, Guangdong Key Lab Intelligent Informat Proc, Shenzhen, Peoples R China
[3] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen, Peoples R China
[4] Univ Nottingham, Sch Comp Sci, Nottingham, England
Keywords
Monocular depth estimation; Convolutional neural network; Unsupervised learning; Lightweight; Real-time; INDOOR;
DOI
10.1016/j.isprsjprs.2020.06.004
Chinese Library Classification number
P9 [Physical Geography];
Discipline classification code
0705; 070501;
Abstract
Predicting depth from a single image is an attractive research topic because it provides an additional dimension of information that helps machines perceive the world. Recently, deep learning has emerged as an effective approach to monocular depth estimation. Because labeled data are costly to obtain, there is a recent trend of moving from supervised to unsupervised learning for monocular depth estimation. However, most unsupervised methods that achieve high depth prediction accuracy require a deep network architecture that is too heavy and complex to run on embedded devices with limited storage and memory. To address this issue, we propose a new, powerful network with a recurrent module that achieves the capability of a deep network while remaining extremely lightweight, enabling real-time, high-performance unsupervised monocular depth prediction from video sequences. In addition, we propose a novel, efficient upsample block that fuses the features from the associated encoder layer and recovers the spatial size of the features with a small number of model parameters. We validate the effectiveness of our approach through extensive experiments on the KITTI dataset. Our new model runs at about 110 frames per second (fps) on a single GPU, 37 fps on a single CPU, and 2 fps on a Raspberry Pi 3. Moreover, it achieves higher depth accuracy with nearly 33 times fewer model parameters than state-of-the-art models. To the best of our knowledge, this work is the first extremely lightweight neural network trained on monocular video sequences for real-time unsupervised monocular depth estimation, which opens up the possibility of implementing deep learning-based real-time unsupervised monocular depth prediction on low-cost embedded devices.
Pages: 255-267
Page count: 13
Related papers
50 in total
  • [41] An Automatic System for Real-Time Identifying Atrial Fibrillation by Using a Lightweight Convolutional Neural Network
    Lai, Dakun
    Zhang, Xinshu
    Bu, Yuxiang
    Su, Ye
    Ma, Chang-Sheng
    IEEE ACCESS, 2019, 7 : 130074 - 130084
  • [42] REAL-TIME UNSUPERVISED MULTI-VIEW DEPTH ESTIMATION NETWORK FOR VIRTUAL VIEW SYNTHESIS
    Qiu, Ke
    Gu, Song
    Liu, Shiyi
    Lai, Yawen
    Cai, Yangang
    Wang, Ronggang
    2021 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2021,
  • [43] STRUCTURE GENERATION AND GUIDANCE NETWORK FOR UNSUPERVISED MONOCULAR DEPTH ESTIMATION
    Wang, Chaoqun
    Chen, Xuejin
    Min, Shaobo
    Wu, Feng
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1264 - 1269
  • [44] CNNapsule: A Lightweight Network with Fusion Features for Monocular Depth Estimation
    Wang, Yinchu
    Zhu, Haijiang
    Liu, Mengze
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT I, 2021, 12891 : 507 - 518
  • [45] LW-Net: A Lightweight Network for Monocular Depth Estimation
    Feng, Cheng
    Zhang, Congxuan
    Chen, Zhen
    Li, Ming
    Chen, Hao
    Fan, Bingbing
    IEEE ACCESS, 2020, 8 : 196287 - 196298
  • [46] Depth estimation for monocular image based on convolutional neural networks
    Niu B.
    Tang M.
    Chen X.
    International Journal of Circuits, Systems and Signal Processing, 2021, 15 : 533 - 540
  • [47] Real-time Depth Enhanced Monocular Odometry
    Zhang, Ji
    Kaess, Michael
    Singh, Sanjiv
    2014 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2014), 2014, : 4973 - 4980
  • [48] Lightweight Convolutional Neural Network for Real-Time Face Detector on CPU Supporting Interaction of Service Robot
    Putro, Muhamad Dwisnanto
    Duy-Linh Nguyen
    Jo, Kang-Hyun
    2020 13TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION (HSI), 2020, : 94 - 99
  • [49] Real-Time Self-Supervised Monocular Depth Estimation Without GPU
    Poggi, Matteo
    Tosi, Fabio
    Aleotti, Filippo
    Mattoccia, Stefano
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (10) : 17342 - 17353
  • [50] Monocular depth estimation with geometrical guidance using a multi-level convolutional neural network
    Amirkolaee, Hamed Amini
    Arefi, Hossein
    APPLIED SOFT COMPUTING, 2019, 84