FastMDE: A Fast CNN Architecture for Monocular Depth Estimation at High Resolution

Cited by: 6

Authors:

Thien-Thanh Dao [1]

Quoc-Viet Pham [2]

Won-Joo Hwang [3]

Affiliations:

[1] Pusan Natl Univ, Dept Comp Engn, Yangsan Si 50612, South Korea

[2] Pusan Natl Univ, Korean Southeast Ctr 4th Ind Revolut Leader Educ, Busan 46241, South Korea

[3] Pusan Natl Univ, Dept Biomed Convergence Engn, Yangsan Si 50612, South Korea

Source:

IEEE ACCESS | 2022 / Vol. 10

Funding:

National Research Foundation of Singapore;

Keywords:

Estimation; Semantics; Decoding; Convolutional neural networks; Computer architecture; Computational modeling; Unsupervised learning; Efficient CNN; deep neural network; depth map; supervised learning; self-supervised learning; IMAGE; NETWORK;

DOI:

10.1109/ACCESS.2022.3145969

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

A depth map helps robots and autonomous vehicles (AVs) visualize the three-dimensional world so that they can navigate and localize neighboring obstacles. However, it is difficult to develop a deep learning model that can estimate the depth map from a single image in real time. This study proposes a fast monocular depth estimation (MDE) model named FastMDE by optimizing the deep convolutional neural network according to the encoder-decoder architecture. The decoder needs to obtain partial and semantic feature maps from the encoding phase to improve the depth estimation accuracy. Therefore, we designed FastMDE with two effective strategies. The first strategy involved redesigning the skip connection with the features of the squeeze-excitation module to obtain partial and semantic feature maps of the encoding phase. The second strategy involved redesigning the decoder by using the fusion dense block to permit the use of high-resolution features that were learned earlier in the network before upsampling. The proposed FastMDE model uses only 4.1 M parameters, far fewer than state-of-the-art models use. FastMDE achieves higher accuracy and lower latency than previous models. This study also demonstrates that MDE can leverage deep neural networks in real time (i.e., 30 fps) on the Linux embedded board Nvidia Jetson Xavier NX. The model can facilitate the development of applications that combine superior performance with easy deployment on an embedded platform.

Pages: 16111-16122

Page count: 12
