SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds

Cited by: 27
Authors
Sun, Pei [1]
Tan, Mingxing [1]
Wang, Weiyue [1]
Liu, Chenxi [1]
Xia, Fei [1]
Leng, Zhaoqi [1]
Anguelov, Dragomir [1]
Affiliation
[1] Waymo LLC, Palo Alto, CA 94301 USA
DOI
10.1007/978-3-031-20080-9_25
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
3D object detection in point clouds is a core component for modern robotics and autonomous driving systems. A key challenge in 3D object detection comes from the inherent sparse nature of point occupancy within the 3D scene. In this paper, we propose Sparse Window Transformer (SWFormer), a scalable and accurate model for 3D object detection, which can take full advantage of the sparsity of point clouds. Built upon the idea of window-based Transformers, SWFormer converts 3D points into sparse voxels and windows, and then processes these variable-length sparse windows efficiently using a bucketing scheme. In addition to self-attention within each spatial window, our SWFormer also captures cross-window correlation with multi-scale feature fusion and window shifting operations. To further address the unique challenge of detecting 3D objects accurately from sparse features, we propose a new voxel diffusion technique. Experimental results on the Waymo Open Dataset show our SWFormer achieves state-of-the-art 73.36 L2 mAPH on vehicle and pedestrian for 3D object detection on the official test set, outperforming all previous single-stage and two-stage models, while being much more efficient.
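The abstract describes converting points into sparse voxels and windows, then handling the variable-length windows with a bucketing scheme. A minimal sketch of that idea follows; it is an illustration under stated assumptions, not the authors' implementation. It assumes a 2D bird's-eye-view grid, and the function name `bucket_sparse_windows`, its parameters, and the bucket capacities are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def bucket_sparse_windows(points, voxel_size=0.5, window_size=4, bucket_caps=(2, 8)):
    """Group non-empty BEV voxels into windows and pad each window to a bucket length.

    Hypothetical helper for illustration only. Returns a dict mapping
    bucket capacity -> (padded_voxel_coords, validity_mask).
    """
    # Quantize x, y into integer voxel coordinates; keep each non-empty voxel once.
    voxel_xy = np.floor(points[:, :2] / voxel_size).astype(np.int64)
    voxels = np.unique(voxel_xy, axis=0)

    # Assign every voxel to a non-overlapping window of window_size x window_size voxels.
    windows, window_id = np.unique(voxels // window_size, axis=0, return_inverse=True)
    window_id = window_id.reshape(-1)
    counts = np.bincount(window_id, minlength=len(windows))

    # A window can hold at most window_size**2 voxels, so that is the largest bucket.
    full = window_size * window_size
    caps = sorted({c for c in bucket_caps if 0 < c < full} | {full})
    # Each window goes to the smallest bucket whose capacity covers its voxel count.
    bucket_of_window = np.searchsorted(caps, counts)

    # Sort voxels by window so that each window's voxels are contiguous.
    order = np.argsort(window_id, kind="stable")
    sorted_voxels = voxels[order]
    starts = np.concatenate(([0], np.cumsum(counts)[:-1]))

    buckets = {}
    for b, cap in enumerate(caps):
        members = np.flatnonzero(bucket_of_window == b)
        if members.size == 0:
            continue
        padded = np.zeros((members.size, cap, 2), dtype=np.int64)
        mask = np.zeros((members.size, cap), dtype=bool)
        for row, w in enumerate(members):
            n = counts[w]
            padded[row, :n] = sorted_voxels[starts[w]:starts[w] + n]
            mask[row, :n] = True  # True marks real voxels, False marks padding
        buckets[cap] = (padded, mask)
    return buckets


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_points = rng.uniform(-10.0, 10.0, size=(200, 3))
    for cap, (coords, mask) in bucket_sparse_windows(demo_points).items():
        print(cap, coords.shape, int(mask.sum()))
```

Within each bucket, all windows share one padded length, so self-attention could be run as a single batched operation per bucket with the mask hiding the padding. Windows holding only a few voxels avoid being padded to the full window capacity.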
Pages: 426-442
Page count: 17