Selective Transfer Learning of Cross-Modality Distillation for Monocular 3D Object Detection

Cited by: 1
Authors
Ding, Rui [1 ]
Yang, Meng [1 ]
Zheng, Nanning [1 ]
Affiliation
[1] Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics, Xi'an 710049, People's Republic of China
Funding
National Science Foundation of the United States;
Keywords
Three-dimensional displays; Laser radar; Uncertainty; Object detection; Feature extraction; Estimation; Knowledge engineering; 3D object detection; depth estimation; cross-modality; knowledge distillation; selective transfer;
DOI
10.1109/TCSVT.2024.3405992
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Codes
0808; 0809
Abstract
Monocular 3D object detection is a promising yet ill-posed task for autonomous vehicles due to the lack of accurate depth information. Cross-modality knowledge distillation can effectively transfer depth information from a LiDAR-based network to an image-based network. However, the modality gap between image and LiDAR severely limits its accuracy. In this paper, we systematically investigate, for the first time, the negative transfer problem induced by the modality gap in cross-modality distillation, covering not only the architecture inconsistency issue but, more importantly, the feature overfitting issue. We propose a selective learning approach named MonoSTL to overcome these issues, which encourages positive transfer of depth information from LiDAR while alleviating negative transfer to the image-based network. On the one hand, we use similar architectures to ensure spatial alignment of features between the image-based and LiDAR-based networks. On the other hand, we develop two novel distillation modules, Depth-Aware Selective Feature Distillation (DASFD) and Depth-Aware Selective Relation Distillation (DASRD), which selectively learn positive features and relationships of objects by integrating depth uncertainty into feature and relation distillation, respectively. Our approach can be seamlessly integrated into various CNN-based and DETR-based models; we validate it on three recent models on KITTI and one recent model on nuScenes. Extensive experiments show that our approach considerably improves the accuracy of the base models and achieves the best accuracy among recently released SOTA models. The code is released at https://github.com/DingCodeLab/MonoSTL.
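The abstract does not give implementation details for the selective distillation modules. Below is a minimal sketch, assuming the image-based student predicts a per-location depth log-variance (an uncertainty estimate) that gates how strongly each location imitates the spatially aligned LiDAR teacher feature; the function and argument names (selective_feature_distillation, depth_log_sigma) are illustrative assumptions, not the authors' released API.

import torch
import torch.nn.functional as F


def selective_feature_distillation(student_feat: torch.Tensor,
                                   teacher_feat: torch.Tensor,
                                   depth_log_sigma: torch.Tensor) -> torch.Tensor:
    """Depth-uncertainty-weighted feature imitation (illustrative sketch only).

    student_feat, teacher_feat: (B, C, H, W) spatially aligned feature maps.
    depth_log_sigma:            (B, 1, H, W) predicted log std of the depth estimate.
    Locations with high depth uncertainty get a small weight, which is one way to
    suppress negative transfer from the LiDAR teacher at unreliable regions.
    """
    # Confidence weight in (0, 1]; low uncertainty -> weight close to 1.
    weight = torch.exp(-depth_log_sigma)
    # Per-location squared error against the detached teacher features.
    per_loc = F.mse_loss(student_feat, teacher_feat.detach(),
                         reduction="none").mean(dim=1, keepdim=True)
    # Normalize by the total weight so the loss scale is independent of overall confidence.
    return (weight * per_loc).sum() / weight.sum().clamp(min=1e-6)


if __name__ == "__main__":
    B, C, H, W = 2, 64, 32, 32
    s = torch.randn(B, C, H, W, requires_grad=True)   # student (image) features
    t = torch.randn(B, C, H, W)                       # teacher (LiDAR) features
    log_sigma = torch.randn(B, 1, H, W)               # predicted depth log std
    loss = selective_feature_distillation(s, t, log_sigma)
    loss.backward()
    print(float(loss))

The relation-distillation counterpart (DASRD) would, under the same assumption, apply analogous uncertainty-derived weights to pairwise relations between object features rather than to individual feature locations.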
Pages: 9925-9938
Page count: 14
Related Papers (50 items in total)
  • [41] Zhang, Yan; Lei, Xu; Hu, Qian; Xu, Chang; Yang, Wen; Xia, Gui-Song. Learning Cross-Modality High-Resolution Representation for Thermal Small-Object Detection. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62: 1-15.
  • [42] Manza, L.; Reber, A. S. Implicit Learning: Within-Modality and Cross-Modality Transfer of Tacit Knowledge. BULLETIN OF THE PSYCHONOMIC SOCIETY, 1991, 29 (06): 499.
  • [43] Yao, Cuili; Feng, Lin; Kong, Yuqiu; Li, Shengming; Li, Hang. Double cross-modality progressively guided network for RGB-D salient object detection. IMAGE AND VISION COMPUTING, 2022, 117.
  • [44] Kim, Seong-heum; Hwang, Youngbae. A Survey on Deep Learning Based Methods and Datasets for Monocular 3D Object Detection. ELECTRONICS, 2021, 10 (04): 1-22.
  • [45] Wu, Zizhang; Wu, Yunzhe; Pu, Jian; Li, Xianzhi; Wang, Xiaoquan. Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023: 2892-2900.
  • [46] Xu, Lian; Bennamoun, Mohammed; Boussaid, Farid; Ana, Senjian; Sohel, Ferdous. Coral Classification Using DenseNet and Cross-modality Transfer Learning. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019.
  • [47] Liu, Yabo; Wang, Jinghua; Huang, Chao; Wang, Yaowei; Xu, Yong. CIGAR: Cross-Modality Graph Reasoning for Domain Adaptive Object Detection. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 23776-23786.
  • [48] Zhang, Haiming; Yan, Xu; Bai, Dongfeng; Gao, Jiantao; Wang, Pan; Liu, Bingbing; Cui, Shuguang; Li, Zhen. RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024: 7060-7068.
  • [49] Cho, Hyeon; Choi, Junyong; Baek, Geonwoo; Hwang, Wonjun. itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 13540-13549.
  • [50] Liu, Xianpeng; Zheng, Ce; Cheng, Kelvin; Xue, Nan; Qi, Guo-Jun; Wu, Tianfu. Monocular 3D Object Detection with Bounding Box Denoising in 3D by Perceiver. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023: 6413-6423.