Real-Time Monocular Depth Estimation Merging Vision Transformers on Edge Devices for AIoT

Cited by: 2

Authors:

Liu, Xihao [1]

Wei, Wei [1]

Liu, Cheng [1]

Peng, Yuyang [2]

Huang, Jinhao [1]

Li, Jun [1]

Affiliations:

[1] Guangzhou Univ, Sch Elect & Commun Engn, Guangzhou 510006, Peoples R China

[2] Macau Univ Sci & Technol, Sch Comp Sci & Engn, Macau, Peoples R China

Source:

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT | 2023 / Vol. 72

Keywords:

Estimation; Semantics; Real-time systems; Transformers; Feature extraction; Decoding; Task analysis; Artificial intelligence of things (AIoT); attention; monocular depth estimation; real-time; transformers;

DOI:

10.1109/TIM.2023.3264039

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Depth estimation is required to build the 3-D perception capability of the artificial intelligence of things (AIoT). Real-time inference with extremely low computing resource consumption is critical on edge devices. However, most single-view depth estimation networks focus on improving accuracy when running on high-end GPUs, which conflicts with the real-time requirement on edge devices. To address this issue, this article proposes a novel encoder-decoder network for real-time monocular depth estimation on edge devices. The proposed network merges semantic information across the global field via an efficient transformer-based module to provide more object detail for depth assignment. The transformer-based module is integrated at the lowest-resolution level of the encoder-decoder architecture, which greatly reduces the parameters of the vision transformer (ViT). In particular, we propose a novel patch convolutional layer for low-latency feature extraction in the encoder and an SConv5 layer for effective depth assignment in the decoder. The proposed network achieves an outstanding balance between accuracy and speed on the NYU Depth v2 dataset: a low root mean square error (RMSE) of 0.554 and a speed of 58.98 FPS on an NVIDIA Jetson Nano device with TensorRT optimization, outperforming most state-of-the-art real-time results.
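
For readers unfamiliar with the two figures quoted in the abstract, the following is a minimal sketch of how an RMSE score and a per-frame latency are typically computed. It is not taken from the paper: the function names rmse and frame_latency_ms are hypothetical, and skipping pixels whose ground-truth depth is not positive is an assumption based on common depth-evaluation practice.

```python
import math


def rmse(pred, target):
    """Root mean square error between predicted and ground-truth depths.

    Assumption: pixels with non-positive ground truth are treated as
    missing and skipped; the paper's exact masking rule is not stated here.
    """
    pairs = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    squared_error = sum((p - t) ** 2 for p, t in pairs)
    return math.sqrt(squared_error / len(pairs))


def frame_latency_ms(fps):
    """Convert a throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Toy depth values, flattened to 1-D lists for illustration only.
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(frame_latency_ms(58.98))
```

Under this conversion, the reported 58.98 FPS corresponds to roughly 17 ms per frame.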


Pages: 9
