SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras

Cited by: 0
Authors
Tang, Yingqi [1]
Meng, Zhaotie [1]
Chen, Guoliang [1]
Cheng, Erkang [1]
Affiliations
[1] Nullmax, Shanghai, People's Republic of China
Source
Keywords
Autonomous Driving; 3D Object Detection; Transformer
DOI
10.1007/978-3-031-72627-9_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The field of autonomous driving has attracted considerable interest in approaches that directly infer 3D objects in the Bird's Eye View (BEV) from multiple cameras. Some attempts have also explored utilizing 2D detectors from single images to enhance the performance of 3D detection. However, these approaches rely on a two-stage process with separate detectors, where the 2D detection results are utilized only once for token selection or query initialization. In this paper, we present a single model termed SimPB, which Simultaneously detects 2D objects in the Perspective view and 3D objects in the BEV space from multiple cameras. To achieve this, we introduce a hybrid decoder consisting of several multi-view 2D decoder layers and several 3D decoder layers, specifically designed for their respective detection tasks. A Dynamic Query Allocation module and an Adaptive Query Aggregation module are proposed to continuously update and refine the interaction between 2D and 3D results, in a cyclic 3D-2D-3D manner. Additionally, Query-group Attention is utilized to strengthen the interaction among 2D queries within each camera group. In the experiments, we evaluate our method on the nuScenes dataset and demonstrate promising results for both 2D and 3D detection tasks. Our code is available at: https://github.com/nullmax-vision/SimPB.
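The abstract does not say how the Dynamic Query Allocation module assigns 3D queries to cameras. One plausible reading, offered here purely as an assumption and not as the authors' implementation, is that each 3D query center is projected into every camera and kept for the cameras in which it is visible. The minimal NumPy sketch below illustrates only that projection-and-visibility step; the function name `allocate_queries`, the toy projection matrices, and the image size are all hypothetical.

```python
import numpy as np


def allocate_queries(centers, projections, image_size):
    """Hypothetical allocation step (an assumption, not the SimPB code).

    centers:     (num_queries, 3) 3D query centers in a shared frame.
    projections: (num_cameras, 3, 4) matrices mapping that frame to pixels.
    image_size:  (width, height) in pixels, shared by all cameras.
    Returns a boolean (num_cameras, num_queries) visibility mask.
    """
    width, height = image_size
    # Append a 1 to each center to get homogeneous coordinates.
    ones = np.ones((centers.shape[0], 1))
    homo = np.concatenate([centers, ones], axis=1)
    # Project every center into every camera: (num_cameras, num_queries, 3).
    cam_pts = np.einsum("cij,qj->cqi", projections, homo)
    depth = cam_pts[..., 2]
    in_front = depth > 1e-5
    # Avoid dividing by non-positive depth; those entries are masked out anyway.
    safe_depth = np.where(in_front, depth, 1.0)
    u = cam_pts[..., 0] / safe_depth
    v = cam_pts[..., 1] / safe_depth
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return in_front & inside


# Toy setup: one camera looking along +z, one along -z (made-up values).
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
front = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
back = K @ np.hstack([np.diag([-1.0, 1.0, -1.0]), np.zeros((3, 1))])
projections = np.stack([front, back])

centers = np.array([[0.0, 0.0, 5.0], [0.0, 0.0, -5.0], [10.0, 0.0, 5.0]])
mask = allocate_queries(centers, projections, (100, 100))
print(mask)
```

In this toy case the first center is visible only to the front camera, the second only to the back camera, and the third falls outside both images. A per-camera mask of this kind is one way 2D queries could be grouped by camera before any within-group attention is applied.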
Pages: 1-17
Page count: 17