Real-Time Monocular Depth Estimation Merging Vision Transformers on Edge Devices for AIoT

Cited by: 2
Authors
Liu, Xihao [1]
Wei, Wei [1]
Liu, Cheng [1]
Peng, Yuyang [2]
Huang, Jinhao [1]
Li, Jun [1]
Affiliations
[1] Guangzhou Univ, Sch Elect & Commun Engn, Guangzhou 510006, Peoples R China
[2] Macau Univ Sci & Technol, Sch Comp Sci & Engn, Macau, Peoples R China
Keywords
Estimation; Semantics; Real-time systems; Transformers; Feature extraction; Decoding; Task analysis; Artificial intelligence of things (AIoT); attention; monocular depth estimation; real-time; transformers
DOI
10.1109/TIM.2023.3264039
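Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract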
Depth estimation is required to build the 3-D perception capability of the artificial intelligence of things (AIoT). On edge devices, real-time inference with extremely low computing resource consumption is critical. However, most single-view depth estimation networks focus on improving accuracy when running on high-end GPUs, which conflicts with the real-time requirement of edge devices. To address this issue, this article proposes a novel encoder-decoder network for real-time monocular depth estimation on edge devices. The proposed network merges semantic information from the global receptive field via an efficient transformer-based module to provide more object detail for depth assignment. The transformer-based module is integrated at the lowest-resolution level of the encoder-decoder architecture, which greatly reduces the parameters of the vision transformer (ViT). In particular, we propose a novel patch convolutional layer for low-latency feature extraction in the encoder and an SConv5 layer for effective depth assignment in the decoder. The proposed network achieves an outstanding balance between accuracy and speed on the NYU Depth v2 dataset: a low root mean square error (RMSE) of 0.554 and a speed of 58.98 FPS on an NVIDIA Jetson Nano device with TensorRT optimization, outperforming most state-of-the-art real-time results.
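For readers unfamiliar with the two figures quoted in the abstract, the sketch below shows how an RMSE over a depth map and a per-frame latency from an FPS value are commonly computed. It is a minimal illustration, not the authors' evaluation code: the function names `rmse` and `latency_ms`, the toy depth values, and the convention that a ground-truth depth of 0 marks a missing measurement are all assumptions made for this example.

```python
import math


def rmse(pred, target):
    """Root mean square error over pixels with valid ground truth.

    Assumption for this sketch: a ground-truth depth <= 0 means "no
    measurement" and is skipped, as is common for sensor depth maps.
    """
    pairs = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))


def latency_ms(fps):
    """Average time per frame, in milliseconds, for a given frame rate."""
    return 1000.0 / fps


# Toy flattened depth maps in metres (hypothetical values).
pred = [1.0, 2.5, 4.0, 3.0]
target = [1.5, 2.5, 3.0, 0.0]  # last pixel has no ground truth

print(f"RMSE: {rmse(pred, target):.4f} m")
print(f"Latency at 58.98 FPS: {latency_ms(58.98):.2f} ms per frame")
```

Under this reading, the reported 58.98 FPS corresponds to roughly 17 ms per frame, well inside the 33 ms budget of a 30 FPS camera stream.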
Pages: 9