Aerial Monocular 3D Object Detection

Cited by: 12

Authors:

Hu, Yue [1]

Fang, Shaoheng [1]

Xie, Weidi [1,2]

Chen, Siheng [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Cooperat Medianet Innovat Ctr CMIC, Shanghai 200240, Peoples R China

[2] Shanghai AI Lab, Shanghai 200240, Peoples R China

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2023 / Vol. 8 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

Three-dimensional displays; Object detection; Drones; Deformation; Feature extraction; Autonomous vehicles; Task analysis; Aerial systems; Perception and autonomy;

DOI:

10.1109/LRA.2023.3245421

Chinese Library Classification:

TP24 [Robot technology];

Discipline classification code:

080202 ; 1405 ;

Abstract:

Drones equipped with cameras can significantly enhance the human ability to perceive the world because of their remarkable maneuverability in 3D space. Ironically, object detection for drones has always been conducted in the 2D image space, which fundamentally limits their ability to understand 3D scenes. Furthermore, existing 3D object detection methods developed for autonomous driving cannot be directly applied to drones due to the lack of deformation modeling, which is essential for the distant aerial perspective with sensitive distortion and small objects. To fill the gap, this work proposes a dual-view detection system named DVDET to achieve aerial monocular object detection in both the 2D image space and the 3D physical space. To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module that can properly warp information from the drone's perspective to the bird's-eye view (BEV). Compared to the monocular methods for cars, our transformation includes a learnable deformable network for explicitly revising the severe deviation. To address the dataset challenge, we propose a new large-scale simulation dataset named AM3D-Sim, and a new real-world aerial dataset named AM3D-Real with high-quality annotations for 3D object detection. Extensive experiments show that i) aerial monocular 3D object detection is feasible; ii) the model pre-trained on the simulation dataset helps real-world performance; and iii) DVDET also helps monocular 3D object detection for cars. To encourage more researchers to investigate this area, we have released the dataset and related code.

Pages: 1959-1966

Page count: 8

50 items in total

[1] Disentangling Monocular 3D Object Detection
Simonelli, Andrea
Bulo, Samuel Rota
Porzi, Lorenzo
Lopez-Antequera, Manuel
Kontschieder, Peter
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1991 - 1999
[2] Monocular 3D Object Detection for Autonomous Driving
Chen, Xiaozhi
Kundu, Kaustav
Zhang, Ziyu
Ma, Huimin
Fidler, Sanja
Urtasun, Raquel
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2147 - 2156
[3] Dimension Embeddings for Monocular 3D Object Detection
Zhang, Yunpeng
Zheng, Wenzhao
Zhu, Zheng
Huang, Guan
Du, Dalong
Zhou, Jie
Lu, Jiwen
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1579 - 1588
[4] Learning Occupancy for Monocular 3D Object Detection
Peng, Liang
Xu, Junkai
Cheng, Haoran
Yang, Zheng
Wu, Xiaopei
Qian, Wei
Wang, Wenxiao
Wu, Boxi
Cai, Deng
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 10281 - 10292
[5] Uncertainty Prediction for Monocular 3D Object Detection
Mun, Junghwan
Choi, Hyukdoo
SENSORS, 2023, 23 (12)
[6] Multivariate Probabilistic Monocular 3D Object Detection
Shi, Xuepeng
Chen, Zhixiang
Kim, Tae-Kyun
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 4270 - 4279
[7] Homography Loss for Monocular 3D Object Detection
Gu, Jiaqi
Wu, Bojian
Fan, Lubin
Huang, Jianqiang
Cao, Shen
Xiang, Zhiyu
Hua, Xian-Sheng
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1070 - 1079
[8] Monocular 3D object detection for distant objects
Li, Jiahao
Han, Xiaohong
JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (03) : 33021
[9] Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
Liu, Xianpeng
Xue, Nan
Wu, Tianfu
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 1810 - 1818
[10] Monocular 3D Object Detection with Bounding Box Denoising in 3D by Perceiver
Liu, Xianpeng
Zheng, Ce
Cheng, Kelvin
Xue, Nan
Qi, Guo-Jun
Wu, Tianfu
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 6413 - 6423
