Self-Supervised Monocular Depth Estimation Method for Joint Semantic Segmentation

Cited: 0
Authors
Song X. [1 ,2 ]
Hu H. [1 ]
Ning J. [1 ]
Liang L. [1 ]
Lu X. [1 ,2 ]
Hei X. [1 ,2 ]
Affiliations
[1] School of Computer Science and Engineering, Xi'an University of Technology, Xi'an
[2] Human Machine Integration Intelligent Robot Shaanxi Provincial University Engineering Research Center (Xi'an University of Technology), Xi'an
Funding
National Natural Science Foundation of China;
关键词
depth estimation; multi-task association; self-supervised deep learning; semantic segmentation; shared encoder;
DOI
10.7544/issn1000-1239.202330485
Abstract
This paper investigates the mutually beneficial relationship between depth estimation and semantic segmentation and proposes USegDepth, a self-supervised monocular depth estimation method jointly trained with semantic segmentation. A shared encoder for the two tasks provides semantic guidance for depth estimation. To further improve the encoder's performance across tasks, a multi-task feature extraction module is designed; stacking this module to form the shared encoder alleviates the poor feature representation caused by a limited receptive field and a lack of cross-channel interaction, further improving model accuracy. In addition, a cross-task interaction module enables bidirectional cross-domain information exchange to refine the depth features, improving depth estimation, especially in weakly textured regions and at object boundaries where photometric-consistency supervision is weak. Trained and evaluated on the KITTI dataset, USegDepth reduces the mean squared relative error by 0.176 percentage points compared with SGDepth, and its threshold accuracy reaches 98.4% at a threshold of 1.25³, demonstrating its high accuracy in depth prediction. © 2024 Science Press. All rights reserved.
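The architecture described in the abstract can be pictured as one encoder feeding two task heads. Below is a minimal PyTorch sketch of that shared-encoder, dual-decoder layout. The block sizes, the number of segmentation classes, and the squeeze-and-excitation-style channel gate (standing in for the paper's multi-task feature extraction module) are illustrative assumptions, not the authors' implementation; the cross-task interaction module is only indicated by a comment.

import torch
import torch.nn as nn

class MultiTaskBlock(nn.Module):
    # Hypothetical stand-in for the multi-task feature extraction module:
    # a strided conv followed by a squeeze-and-excitation-style channel gate,
    # addressing the cross-channel interaction the abstract mentions.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = self.conv(x)
        return x * self.gate(x)  # channel-wise reweighting

class USegDepthSketch(nn.Module):
    # Shared encoder (stacked MultiTaskBlocks) with two task-specific heads.
    def __init__(self, channels=(3, 32, 64, 128), num_classes=19):
        super().__init__()
        self.encoder = nn.Sequential(*[
            MultiTaskBlock(c_in, c_out)
            for c_in, c_out in zip(channels[:-1], channels[1:])
        ])
        self.depth_head = nn.Conv2d(channels[-1], 1, 3, padding=1)
        self.seg_head = nn.Conv2d(channels[-1], num_classes, 3, padding=1)

    def forward(self, x):
        feats = self.encoder(x)
        # The paper's cross-task interaction module would exchange
        # information between the two branches here; omitted for brevity.
        disp = torch.sigmoid(self.depth_head(feats))  # inverse-depth map in (0, 1)
        logits = self.seg_head(feats)                 # per-class segmentation logits
        return disp, logits

model = USegDepthSketch()
disp, logits = model(torch.randn(1, 3, 192, 640))
print(disp.shape, logits.shape)  # [1, 1, 24, 80] and [1, 19, 24, 80]

In a self-supervised setup of this kind, the depth branch is typically trained with a photometric reconstruction loss between temporally adjacent frames, while the segmentation branch is supervised; sharing the encoder is what lets semantic features guide the depth features.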
Pages: 1336-1347
Number of pages: 11
References
36 in total
  • [1] Jiang Junjun, Li Zhenyu, Liu Xianming, Overview of monocular depth estimation methods based on deep learning, Chinese Journal of Computers, 45, 6, pp. 1276-1307, (2022)
  • [2] Cheng Xinjing, Wang Peng, Yang Ruigang, Learning depth with convolutional spatial propagation network, IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 10, pp. 2361-2379, (2019)
  • [3] Godard C, Mac Aodha O, Brostow G J., Unsupervised monocular depth estimation with left-right consistency, Proc of the 35th IEEE Conf on Computer Vision and Pattern Recognition (CVPR), pp. 6602-6611, (2017)
  • [4] Imran S, Long Yunfei, Liu Xiaoming, Et al., Depth coefficients for depth completion, Proc of the 37th IEEE Conf on Computer Vision and Pattern Recognition (CVPR), pp. 12438-12447, (2019)
  • [5] Guo Xiaoyang, Li Hongsheng, Yi Shuai, Et al., Learning monocular depth by distilling cross-domain stereo networks, Proc of the 15th European Conf on Computer Vision (ECCV), pp. 484-500, (2018)
  • [6] Luo Yue, Ren J, Lin M, Et al., Single view stereo matching, Proc of the 36th IEEE Conf on Computer Vision and Pattern Recognition (CVPR), pp. 155-163, (2018)
  • [7] Shu Chang, Yu Kun, Duan Zhixiang, Et al., Feature-metric loss for self-supervised learning of depth and egomotion, Proc of the 16th European Conf on Computer Vision (ECCV), pp. 572-588, (2020)
  • [8] Yin Zhichao, Shi Jianping, GeoNet: Unsupervised learning of dense depth, optical flow and camera pose, Proc of the 36th IEEE Conf on Computer Vision and Pattern Recognition (CVPR), pp. 1983-1992, (2018)
  • [9] Qi Xiaojun, Liao Renjie, Liu Zhengzhe, Et al., GeoNet: Geometric neural network for joint depth and surface normal estimation, Proc of the 36th IEEE Conf on Computer Vision and Pattern Recognition (CVPR), pp. 283-291, (2018)
  • [10] Saha S, Obukhov A, Paudel D P, Et al., Learning to relate depth and semantics for unsupervised domain adaptation, Proc of the 39th IEEE Conf on Computer Vision and Pattern Recognition (CVPR), pp. 8193-8203, (2021)