MiniNet: An extremely lightweight convolutional neural network for real-time unsupervised monocular depth estimation

Citations: 25
Authors
Liu, Jun [1, 2, 3]
Li, Qing [1, 2, 3]
Cao, Rui [1, 2, 3]
Tang, Wenming [1, 2, 3]
Qiu, Guoping [1, 2, 3, 4]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen, Peoples R China
[2] Shenzhen Univ, Guangdong Key Lab Intelligent Informat Proc, Shenzhen, Peoples R China
[3] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen, Peoples R China
[4] Univ Nottingham, Sch Comp Sci, Nottingham, England
Keywords
Monocular depth estimation; Convolutional neural network; Unsupervised learning; Lightweight; Real-time; INDOOR;
DOI
10.1016/j.isprsjprs.2020.06.004
Chinese Library Classification (CLC) number
P9 [Physical Geography];
Discipline classification code
0705 ; 070501 ;
Abstract
Predicting depth from a single image is an attractive research topic because it provides one more dimension of information that enables machines to better perceive the world. Recently, deep learning has emerged as an effective approach to monocular depth estimation. Because obtaining labeled data is costly, there is a recent trend of moving from supervised to unsupervised learning for monocular depth estimation. However, most unsupervised learning methods capable of achieving high depth prediction accuracy require a deep network architecture that is too heavy and complex to run on embedded devices with limited storage and memory. To address this issue, we propose a new powerful network with a recurrent module that achieves the capability of a deep network while maintaining an extremely lightweight size, for real-time, high-performance unsupervised monocular depth prediction from video sequences. In addition, a novel efficient upsample block is proposed to fuse the features from the associated encoder layer and recover the spatial size of the features with a small number of model parameters. We validate the effectiveness of our approach via extensive experiments on the KITTI dataset. Our new model runs at about 110 frames per second (fps) on a single GPU, 37 fps on a single CPU, and 2 fps on a Raspberry Pi 3. Moreover, it achieves higher depth accuracy with nearly 33 times fewer model parameters than state-of-the-art models. To the best of our knowledge, this work is the first extremely lightweight neural network trained on monocular video sequences for real-time unsupervised monocular depth estimation, which opens up the possibility of implementing deep learning-based real-time unsupervised monocular depth prediction on low-cost embedded devices.
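The abstract does not say how the recurrent module is built. As a rough illustration only, the sketch below assumes the module reuses one convolution several times (weight sharing) and counts parameters to show why that can give extra effective depth without extra model size. The function names, the channel width, and the step count are hypothetical and are not taken from the paper.

```python
def conv_params(c_in, c_out, k=3, bias=True):
    """Learnable parameters in one k x k 2D convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def stacked_params(channels, steps, k=3):
    """`steps` distinct convolutions applied in sequence (no weight sharing)."""
    return steps * conv_params(channels, channels, k)


def recurrent_params(channels, steps, k=3):
    """One convolution applied `steps` times (assumed weight sharing).

    The count does not depend on `steps`, because the same weights are reused.
    """
    return conv_params(channels, channels, k)


# Hypothetical sizes chosen for illustration, not values from the paper.
channels, steps = 64, 3
stacked = stacked_params(channels, steps)
shared = recurrent_params(channels, steps)
print(f"stacked: {stacked} parameters, shared: {shared} parameters")
```

Under this assumption the saving grows linearly with the number of recurrent steps, while the compute cost per frame is the same in both cases, so weight sharing reduces storage rather than inference time.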
Pages: 255-267
Page count: 13
Related papers
50 in total
  • [31] Lightweight Monocular Depth Estimation with an Edge Guided Network
    Dong, Xingshuai
    Garratt, Matthew A.
    Anavatti, Sreenatha G.
    Abbass, Hussein A.
    Dong, Junyu
    2022 17TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), 2022, : 204 - 210
  • [32] Visualization of Convolutional Neural Networks for Monocular Depth Estimation
    Hu, Junjie
    Zhang, Yan
    Okatani, Takayuki
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 3868 - 3877
  • [33] A real-time railway fastener inspection method using the lightweight depth estimation network
    Zhong, Haoyu
    Liu, Long
    Wang, Jie
    Fu, Qinyi
    Yi, Bing
    MEASUREMENT, 2022, 189
  • [34] LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer
    Zhang, Xiangyue
    Li, Hexiao
    Ru, Jingyu
    Ji, Peng
    Wu, Chengdong
    ELECTRONICS, 2024, 13 (12)
  • [35] Unsupervised Depth Estimation from Light Field Using a Convolutional Neural Network
    Peng, Jiayong
    Xiong, Zhiwei
    Liu, Dong
    Chen, Xuejin
    2018 INTERNATIONAL CONFERENCE ON 3D VISION (3DV), 2018, : 295 - 303
  • [36] A Lightweight and Dynamic Convolutional Network for Real-time Semantic Segmentation
    Zhang, Chunyu
    Xu, Fang
    Wu, Chengdong
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 4062 - 4067
  • [37] Exploiting Motion Perception in Depth Estimation through a Lightweight Convolutional Neural Network
    Leite, Pedro Nuno
    Pinto, Andry Maykol
    IEEE Access, 2021, 9 : 76056 - 76068
  • [38] Exploiting Motion Perception in Depth Estimation Through a Lightweight Convolutional Neural Network
    Leite, Pedro Nuno
    Pinto, Andry Maykol
    IEEE ACCESS, 2021, 9 : 76056 - 76068
  • [39] Real-Time Monocular Human Depth Estimation and Segmentation on Embedded Systems
    An, Shan
    Zhou, Fangru
    Yang, Mei
    Zhu, Haogang
    Fu, Changhong
    Tsintotas, Konstantinos A.
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 55 - 62
  • [40] Comparing Monocular Camera Depth Estimation Models for Real-time Applications
    Diab, Abdelrahman
    Sabry, Mohamed
    El Mougy, Amr
    ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 3, 2022, : 673 - 680