MiniNet: An extremely lightweight convolutional neural network for real-time unsupervised monocular depth estimation

Cited by: 25
Authors
Liu, Jun [1, 2, 3]
Li, Qing [1, 2, 3]
Cao, Rui [1, 2, 3]
Tang, Wenming [1, 2, 3]
Qiu, Guoping [1, 2, 3, 4]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen, Peoples R China
[2] Shenzhen Univ, Guangdong Key Lab Intelligent Informat Proc, Shenzhen, Peoples R China
[3] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen, Peoples R China
[4] Univ Nottingham, Sch Comp Sci, Nottingham, England
Keywords
Monocular depth estimation; Convolutional neural network; Unsupervised learning; Lightweight; Real-time; INDOOR;
DOI
10.1016/j.isprsjprs.2020.06.004
Chinese Library Classification (CLC) number
P9 [Physical Geography];
Discipline classification code
0705 ; 070501 ;
Abstract
Predicting depth from a single image is an attractive research topic since it provides one more dimension of information to enable machines to better perceive the world. Recently, deep learning has emerged as an effective approach to monocular depth estimation. As obtaining labeled data is costly, there is a recent trend to move from supervised learning to unsupervised learning to obtain monocular depth. However, most unsupervised learning methods capable of achieving high depth prediction accuracy will require a deep network architecture which will be too heavy and complex to run on embedded devices with limited storage and memory spaces. To address this issue, we propose a new powerful network with a recurrent module to achieve the capability of a deep network while at the same time maintaining an extremely lightweight size for real-time high performance unsupervised monocular depth prediction from video sequences. Besides, a novel efficient upsample block is proposed to fuse the features from the associated encoder layer and recover the spatial size of features with the small number of model parameters. We validate the effectiveness of our approach via extensive experiments on the KITTI dataset. Our new model can run at a speed of about 110 frames per second (fps) on a single GPU, 37 fps on a single CPU, and 2 fps on a Raspberry Pi 3. Moreover, it achieves higher depth accuracy with nearly 33 times fewer model parameters than state-of-the-art models. To the best of our knowledge, this work is the first extremely lightweight neural network trained on monocular video sequences for real-time unsupervised monocular depth estimation, which opens up the possibility of implementing deep learning-based real-time unsupervised monocular depth prediction on low-cost embedded devices.
Pages: 255-267
Number of pages: 13
Related papers
  • [1] OptiDepthNet: A Real-Time Unsupervised Monocular Depth Estimation Network
    Wei, Feng
    Yin, XingHui
    Shen, Jie
    Wang, HuiBin
    WIRELESS PERSONAL COMMUNICATIONS, 2023, 128 (04) : 2831 - 2846
  • [2] OptiDepthNet: A Real-Time Unsupervised Monocular Depth Estimation Network
    Feng Wei
    XingHui Yin
    Jie Shen
    HuiBin Wang
    Wireless Personal Communications, 2023, 128 : 2831 - 2846
  • [3] Real-time Monocular Depth Estimation with Extremely Light-Weight Neural Network
    Chiu, Mian-Jhong
    Chiu, Wei-Chen
    Chen, Hua-Tsung
    Chuang, Jen-Hui
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7050 - 7057
  • [4] Towards real-time unsupervised monocular depth estimation on CPU
    Poggi, Matteo
    Aleotti, Filippo
    Tosi, Fabio
    Mattoccia, Stefano
    2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 5848 - 5854
  • [5] MBUDepthNet: Real-Time Unsupervised Monocular Depth Estimation Method for Outdoor Scenes
    Bian, Zhekai
    Wang, Xia
    Liu, Qiwei
    Lv, Shuaijun
    Wei, Ranfeng
    IEEE ACCESS, 2024, 12 : 63598 - 63609
  • [6] LD-Net: A Lightweight Network for Real-Time Self-Supervised Monocular Depth Estimation
    Xiong, Mingkang
    Zhang, Zhenghong
    Zhang, Tao
    Xiong, Huilin
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 882 - 886
  • [7] Attention based multilayer feature fusion convolutional neural network for unsupervised monocular depth estimation
    Lei, Zeyu
    Wang, Yan
    Li, Zijian
    Yang, Junyao
    NEUROCOMPUTING, 2021, 423 : 343 - 352
  • [8] Unsupervised Monocular Depth Estimation by Fusing Dilated Convolutional Network and SLAM
    Dai Renyue
    Fang Zhijun
    Gao Yongbin
    LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (06)
  • [9] MobileXNet: An Efficient Convolutional Neural Network for Monocular Depth Estimation
    Dong, Xingshuai
    Garratt, Matthew A.
    Anavatti, Sreenatha G.
    Abbass, Hussein A.
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (11) : 20134 - 20147
  • [10] FDDWNET: A LIGHTWEIGHT CONVOLUTIONAL NEURAL NETWORK FOR REAL-TIME SEMANTIC SEGMENTATION
    Liu, Jia
    Zhou, Quan
    Qiang, Yong
    Kang, Bin
    Wu, Xiaofu
    Zheng, Baoyu
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 2373 - 2377