Online supervised attention-based recurrent depth estimation from monocular video

Cited by: 0
Authors
Maslov D. [1]
Makarov I. [1, 2]
Affiliations
[1] School of Data Analysis and Artificial Intelligence, HSE University, Moscow
[2] Samsung-PDMI Joint AI Center, St. Petersburg Department of Steklov Institute of Mathematics, St. Petersburg
Source
Maslov, Dmitrii (dvmaslov@edu.hse.ru) | PeerJ Inc., 2020, Issue 06
Keywords
Augmented Reality; Autonomous Vehicles; Computer Science Methods; Computer Vision; Deep Convolutional Neural Networks; Depth Reconstruction; Recurrent Neural Networks
DOI
10.7717/PEERJ-CS.317
Abstract
Autonomous driving depends heavily on depth information for safe operation. Recently, major progress has been made in both supervised and self-supervised methods for depth reconstruction. However, most current approaches focus on single-frame depth estimation, where the quality limit is hard to beat due to the limitations of supervised learning of deep neural networks in general. One way to improve the quality of existing methods is to utilize temporal information from frame sequences. In this paper, we study intelligent ways of integrating a recurrent block into a common supervised depth estimation pipeline. We propose a novel method that takes advantage of the convolutional gated recurrent unit (convGRU) and convolutional long short-term memory (convLSTM). We compare the use of convGRU and convLSTM blocks and determine the best model for the real-time depth estimation task. We carefully study the training strategy and provide new deep neural network architectures for depth estimation from monocular video that use information from past frames based on an attention mechanism. We demonstrate the efficiency of exploiting temporal information by comparing our best recurrent method with existing image-based and video-based solutions for monocular depth reconstruction. © 2020. Maslov and Makarov. All Rights Reserved.
Pages: 1-22
Page count: 21