Aerial Monocular 3D Object Detection

Cited by: 12
Authors
Hu, Yue [1]
Fang, Shaoheng [1]
Xie, Weidi [1, 2]
Chen, Siheng [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Cooperat Medianet Innovat Ctr CMIC, Shanghai 200240, Peoples R China
[2] Shanghai AI Lab, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Three-dimensional displays; Object detection; Drones; Deformation; Feature extraction; Autonomous vehicles; Task analysis; Aerial systems; Perception and autonomy
DOI
10.1109/LRA.2023.3245421
Chinese Library Classification number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
Drones equipped with cameras can significantly enhance the human ability to perceive the world because of their remarkable maneuverability in 3D space. Ironically, object detection for drones has always been conducted in the 2D image space, which fundamentally limits their ability to understand 3D scenes. Furthermore, existing 3D object detection methods developed for autonomous driving cannot be directly applied to drones because they lack deformation modeling, which is essential for the distant aerial perspective with its sensitive distortion and small objects. To fill this gap, this work proposes a dual-view detection system named DVDET to achieve aerial monocular object detection in both the 2D image space and the 3D physical space. To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module that can properly warp information from the drone's perspective to the bird's-eye view (BEV). Compared to the monocular methods for cars, our transformation includes a learnable deformable network for explicitly revising the severe deviation. To address the dataset challenge, we propose a new large-scale simulation dataset named AM3D-Sim and a new real-world aerial dataset named AM3D-Real with high-quality annotations for 3D object detection. Extensive experiments show that i) aerial monocular 3D object detection is feasible; ii) the model pre-trained on the simulation dataset helps real-world performance; and iii) DVDET also helps monocular 3D object detection for cars. To encourage more researchers to investigate this area, we have released the dataset and related code.
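The abstract describes warping information from the drone's perspective to the BEV. As a rough illustration of the geometric part of that idea only, the sketch below projects a flat-ground BEV grid into the image with a plane homography and samples the nearest pixel. It is not the DVDET module: the flat-ground assumption, the function names `ground_to_image_homography` and `warp_to_bev`, the nearest-neighbour sampling, and the optional `offsets` array (a hand-supplied stand-in for what a learned deformation network might predict) are all assumptions made for this example.

```python
import numpy as np

def ground_to_image_homography(K, R, t):
    """Homography taking ground-plane points (X, Y, 1) on Z = 0 to image pixels.

    Assumes the pinhole model x_cam = R @ X_world + t with intrinsics K.
    """
    return K @ np.column_stack((R[:, 0], R[:, 1], t))

def warp_to_bev(feat, H, xs, ys, offsets=None):
    """Sample an image-space map onto a BEV grid (hypothetical helper).

    feat:    (H_img, W_img) or (H_img, W_img, C) array.
    xs, ys:  1D ground coordinates of the BEV grid columns and rows.
    offsets: optional (len(ys), len(xs), 2) pixel offsets (du, dv); an
             assumed stand-in for a learned deformation, zero if omitted.
    """
    gx, gy = np.meshgrid(xs, ys)                       # (ny, nx) each
    pts = np.stack([gx, gy, np.ones_like(gx)], axis=-1).astype(float)
    proj = pts @ H.T                                   # homogeneous pixels
    depth = proj[..., 2]
    valid = depth > 1e-6                               # in front of camera
    safe = np.where(valid, depth, 1.0)
    u = proj[..., 0] / safe
    v = proj[..., 1] / safe
    if offsets is not None:
        u = u + offsets[..., 0]
        v = v + offsets[..., 1]
    ui = np.rint(u).astype(int)
    vi = np.rint(v).astype(int)
    h_img, w_img = feat.shape[:2]
    valid &= (ui >= 0) & (ui < w_img) & (vi >= 0) & (vi < h_img)
    bev = np.zeros(gx.shape + feat.shape[2:], dtype=feat.dtype)
    bev[valid] = feat[vi[valid], ui[valid]]            # nearest-neighbour
    return bev

# Toy setup (all values assumed): camera 10 m above the ground, looking
# straight down, focal length 100 px, principal point (32, 32).
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 32.0], [0.0, 0.0, 1.0]])
R = np.diag([1.0, -1.0, -1.0])
t = np.array([0.0, 0.0, 10.0])
H = ground_to_image_homography(K, R, t)
image = np.arange(64 * 64, dtype=float).reshape(64, 64)
bev = warp_to_bev(image, H, xs=np.array([-1.0, 0.0, 1.0]),
                  ys=np.array([-1.0, 0.0, 1.0]))
```

With a nadir-looking camera the homography reduces to a scale and shift, so this toy case hides the distortion that an oblique aerial view would introduce; the `offsets` argument only marks where a per-cell correction could enter.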
Pages: 1959-1966
Number of pages: 8