Cross-Modal Knowledge Distillation for Depth Privileged Monocular Visual Odometry

Cited by: 4
Authors
Li, Bin [1 ]
Wang, Shuling [1 ]
Ye, Haifeng [1 ]
Gong, Xiaojin [1 ]
Xiang, Zhiyu [1 ]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310027, Zhejiang, Peoples R China
Keywords
Deep learning methods; knowledge distillation; localization; visual odometry;
DOI
10.1109/LRA.2022.3166457
Chinese Library Classification
TP24 [Robotics];
Discipline codes
080202 ; 1405 ;
Abstract
Most self-supervised monocular visual odometry (VO) methods suffer from the scale ambiguity problem. A promising way to address it is to introduce additional information during training. In this work, we propose a new depth-privileged framework to learn a monocular VO. It assumes that sparse depth is provided at training time but is not available at the test stage. To make full use of the privileged depth information, we propose a cross-modal knowledge distillation method that uses a well-trained visual-lidar odometry (VLO) as a teacher to guide the training of the VO network. Knowledge distillation is conducted at both the output and hint levels. In addition, a distillation condition check is designed to filter out noise that may be contained in the teacher's predictions. Experiments on the KITTI odometry benchmark show that the proposed method produces accurate pose estimates with a recovered actual scale. It also outperforms most stereo-privileged monocular VOs.
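The abstract describes distillation at two levels (output poses and intermediate "hint" features), gated by a condition check that discards unreliable teacher predictions. A minimal sketch of how such gated losses could be combined, assuming simple mean-squared-error terms and a reprojection-error comparison as the check (all names are illustrative; the paper's actual loss terms and condition differ in detail):

```python
import numpy as np

def distillation_loss(student_pose, teacher_pose,
                      student_feat, teacher_feat,
                      err_student, err_teacher):
    """Hypothetical gated distillation loss for a depth-privileged VO student.

    Output-level term: pose discrepancy between student and the VLO teacher.
    Hint-level term: discrepancy between intermediate feature maps.
    Condition check: distill only when the teacher's photometric/reprojection
    error is lower than the student's, i.e. the teacher is likely reliable.
    """
    if err_teacher >= err_student:
        # Teacher prediction deemed noisy for this sample: skip distillation.
        return 0.0
    l_output = np.mean((student_pose - teacher_pose) ** 2)  # output level
    l_hint = np.mean((student_feat - teacher_feat) ** 2)    # hint level
    return float(l_output + l_hint)
```

In this sketch the check operates per sample; a per-pixel or per-frame mask would be a natural refinement, but the exact granularity used by the paper is not stated in the abstract.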
Pages: 6171-6178
Page count: 8
Related papers
50 records in total
  • [1] DistilVPR: Cross-Modal Knowledge Distillation for Visual Place Recognition
    Wang, Sijie
    She, Rui
    Kang, Qiyu
    Jian, Xingchao
    Zhao, Kai
    Song, Yang
    Tay, Wee Peng
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 9, 2024, : 10377 - 10385
  • [2] CroMo: Cross-Modal Learning for Monocular Depth Estimation
    Verdie, Yannick
    Song, Jifei
    Mas, Barnabe
    Busam, Benjamin
    Leonardis, Ales
    McDonagh, Steven
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 3927 - 3937
  • [3] CROSS-MODAL KNOWLEDGE DISTILLATION FOR ACTION RECOGNITION
    Thoker, Fida Mohammad
    Gall, Juergen
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 6 - 10
  • [4] Visual-to-EEG cross-modal knowledge distillation for continuous emotion recognition
    Zhang, Su
    Tang, Chuangao
    Guan, Cuntai
    [J]. PATTERN RECOGNITION, 2022, 130
  • [5] Acoustic NLOS Imaging with Cross-Modal Knowledge Distillation
    Shin, Ui-Hyeon
    Jang, Seungwoo
    Kim, Kwangsu
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 1405 - 1413
  • [6] Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval
    Li, Mingyong
    Wang, Hongya
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 183 - 191
  • [7] Depth Prediction for Monocular Direct Visual Odometry
    Cheng, Ran
    Agia, Christopher
    Meger, David
    Dudek, Gregory
    [J]. 2020 17TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV 2020), 2020, : 70 - 77
  • [8] Cross-Modal Knowledge Distillation with Dropout-Based Confidence
    Cho, Won Ik
    Kim, Jeunghun
    Kim, Nam Soo
    [J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 653 - 657
  • [9] Semi-Supervised Knowledge Distillation for Cross-Modal Hashing
    Su, Mingyue
    Gu, Guanghua
    Ren, Xianlong
    Fu, Hao
    Zhao, Yao
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 662 - 675