FastMDE: A Fast CNN Architecture for Monocular Depth Estimation at High Resolution

Cited by: 6

Authors:

Thien-Thanh Dao ^{[1]}

Quoc-Viet Pham ^{[2]}

Won-Joo Hwang ^{[3]}

Affiliations:

[1] Pusan Natl Univ, Dept Comp Engn, Yangsan Si 50612, South Korea

[2] Pusan Natl Univ, Korean Southeast Ctr 4th Ind Revolut Leader Educ, Busan 46241, South Korea

[3] Pusan Natl Univ, Dept Biomed Convergence Engn, Yangsan Si 50612, South Korea

Source:

IEEE ACCESS | 2022 / Vol. 10

Funding:

National Research Foundation, Singapore;

Keywords:

Estimation; Semantics; Decoding; Convolutional neural networks; Computer architecture; Computational modeling; Unsupervised learning; Efficient CNN; deep neural network; depth map; supervised learning; self-supervised learning; IMAGE; NETWORK;

DOI:

10.1109/ACCESS.2022.3145969

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

A depth map helps robots and autonomous vehicles (AVs) visualize the three-dimensional world to navigate and localize neighboring obstacles. However, it is difficult to develop a deep learning model that can estimate the depth map from a single image in real time. This study proposes a fast monocular depth estimation (MDE) model named FastMDE by optimizing the deep convolutional neural network according to the encoder-decoder architecture. The decoder needs to obtain partial and semantic feature maps from the encoding phase to improve the depth estimation accuracy. Therefore, we designed FastMDE with two effective strategies. The first redesigns the skip connection with the features of the squeeze-excitation module to obtain partial and semantic feature maps of the encoding phase. The second redesigns the decoder using the fusion dense block to permit the use of high-resolution features that were learned earlier in the network before upsampling. The proposed FastMDE model uses only 4.1 M parameters, far fewer than state-of-the-art models. Thus, FastMDE has higher accuracy and lower latency than previous models. This study also demonstrates that MDE can leverage deep neural networks in real time (i.e., 30 fps) on the Nvidia Jetson Xavier NX, a Linux embedded board. The model can facilitate the development of applications with superior performance and easy deployment on an embedded platform.

Pages: 16111 - 16122

Number of pages: 12
