FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

Cited by: 8

Authors:

Chen, Yilun [1]

Yu, Zhiding [3]

Chen, Yukang [1]

Lan, Shiyi [3]

Anandkumar, Anima [2, 3]

Jia, Jiaya [1]

Alvarez, Jose M.

Affiliations:

[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China

[2] CALTECH, Pasadena, CA USA

[3] NVIDIA, Santa Clara, CA USA

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023

Keywords:

DOI:

10.1109/ICCV51070.2023.00771

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

False negatives (FN) in 3D object detection, e.g., missing predictions of pedestrians, vehicles, or other obstacles, can lead to potentially dangerous situations in autonomous driving. While being fatal, this issue is understudied in many current 3D detection methods. In this work, we propose Hard Instance Probing (HIP), a general pipeline that identifies FN in a multi-stage manner and guides the models to focus on excavating difficult instances. For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall. FocalFormer3D features a multi-stage query generation to discover hard objects and a box-level transformer decoder to efficiently distinguish objects from massive object candidates. Experimental results on the nuScenes and Waymo datasets validate the superior performance of FocalFormer3D. The advantage leads to strong performance on both detection and tracking, in both LiDAR and multi-modal settings. Notably, FocalFormer3D achieves a 70.5 mAP and 73.9 NDS on nuScenes detection benchmark, while the nuScenes tracking benchmark shows 72.1 AMOTA, both ranking 1st place on the nuScenes LiDAR leaderboard. Our code is available at https://github.com/NVlabs/FocalFormer3D.

Pages: 8360-8371

Page count: 12
