FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

Cited by: 8
Authors
Chen, Yilun [1]
Yu, Zhiding [3]
Chen, Yukang [1]
Lan, Shiyi [3]
Anandkumar, Anima [2, 3]
Jia, Jiaya [1]
Alvarez, Jose M.
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] CALTECH, Pasadena, CA USA
[3] NVIDIA, Santa Clara, CA USA
DOI
10.1109/ICCV51070.2023.00771
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
False negatives (FN) in 3D object detection, e.g., missing predictions of pedestrians, vehicles, or other obstacles, can lead to potentially dangerous situations in autonomous driving. Although potentially fatal, this issue is understudied in many current 3D detection methods. In this work, we propose Hard Instance Probing (HIP), a general pipeline that identifies FN in a multi-stage manner and guides the models to focus on excavating difficult instances. For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall. FocalFormer3D features a multi-stage query generation to discover hard objects and a box-level transformer decoder to efficiently distinguish objects from massive object candidates. Experimental results on the nuScenes and Waymo datasets validate the superior performance of FocalFormer3D. The advantage leads to strong performance on both detection and tracking, in both LiDAR and multi-modal settings. Notably, FocalFormer3D achieves 70.5 mAP and 73.9 NDS on the nuScenes detection benchmark and 72.1 AMOTA on the nuScenes tracking benchmark, both ranking 1st on the nuScenes LiDAR leaderboard. Our code is available at https://github.com/NVlabs/FocalFormer3D.
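The multi-stage idea described in the abstract can be illustrated with a toy sketch. The abstract does not say how candidates found in earlier stages are excluded, so the neighborhood masking below, the function name `multi_stage_candidates`, and the parameters `stages`, `k`, and `radius` are all assumptions made for illustration. This is not the authors' implementation; the released code at the URL above is the authoritative reference.

```python
def multi_stage_candidates(heatmap, stages=2, k=1, radius=1):
    """Toy multi-stage candidate selection on a 2D score map.

    ASSUMPTION: each stage keeps the k highest-scoring unmasked cells, then
    masks a square neighborhood around each pick so that later stages must
    look at lower-scoring ("harder") regions. Hypothetical helper, not the
    FocalFormer3D code.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    masked = [[False] * cols for _ in range(rows)]
    per_stage = []
    for _ in range(stages):
        # Collect cells not yet claimed by an earlier stage, best score first.
        free = [(heatmap[r][c], r, c)
                for r in range(rows) for c in range(cols)
                if not masked[r][c]]
        free.sort(key=lambda t: -t[0])
        picks = []
        for _, r, c in free:
            if len(picks) == k:
                break
            if masked[r][c]:  # masked by an earlier pick in this same stage
                continue
            picks.append((r, c))
            # Mask the neighborhood of the pick (clipped to the map borders).
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    masked[rr][cc] = True
        per_stage.append(picks)
    return per_stage


if __name__ == "__main__":
    heatmap = [
        [0.9, 0.8, 0.1, 0.2],
        [0.7, 0.6, 0.1, 0.3],
        [0.1, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.4, 0.5],
    ]
    print(multi_stage_candidates(heatmap, stages=2, k=1, radius=1))
```

In this toy map, a single top-2 selection would return the two adjacent high-scoring cells in the upper-left corner, whereas the staged version spends its second pick on the weaker peak in the lower-right corner. That is the recall-oriented behavior the abstract attributes to multi-stage query generation.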
Pages: 8360-8371
Number of pages: 12