A Point Transformer Accelerator With Distribution-Aware Heuristic Distance Calculation

Cited by: 0
Authors
Lian, Yaoxiu [1]
Yang, Xinhao [2]
Hong, Ke [2]
Wang, Yu [2]
Xu, Ningyi [1]
Dai, Guohao [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
[2] Tsinghua Univ, Dept Elect & Comp Engn, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Point cloud compression; Transformers; Three-dimensional displays; Feature extraction; Task analysis; Neural networks; Accuracy; Farthest point sampling (FPS); k-nearest neighbors (kNNs); point cloud; transformer;
DOI
10.1109/TCAD.2024.3445262
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Point clouds are an important form of 3-D data used in applications such as computer vision and autonomous driving, but the irregular and disordered nature of point clouds makes processing them severely challenging. Recently, point-based neural networks for point clouds have been widely used in various 3-D applications. Notably, transformer-based models have demonstrated state-of-the-art accuracy. However, three significant challenges exist: 1) data interdependence hinders parallel execution in networks like Point Transformer; 2) the farthest point sampling (FPS) involves redundant memory access and computational overhead; and 3) intermediate results require repetitive memory access and calculations between FPS and k-nearest neighbor (kNN) operators. These challenges limit Point Transformer's processing speed to 17.80 frames/s on NVIDIA Jetson Orin, below the real-time requirement of around 30 frames/s. In this article, we introduce PTrAcc++, an innovative point transformer accelerator that addresses the aforementioned three challenges at the following three levels. On the computation graph level, our investigation reveals that the Point Transformer's performance suffers minimal degradation when operating within a constrained receptive field. Leveraging this insight, PTrAcc++ strategically frees the MaxPool and attention-kNN layers, along with their associated data dependencies, with an inconsequential loss in accuracy. On the operator level, we identify that the variability for distance computation among accessed points during FPS iterations contributes to redundant memory accesses and computational overhead. PTrAcc++ proposes a distribution-aware heuristic for distance calculation to minimize unnecessary memory accesses and computational redundancies within the FPS operator. On the architecture level, we recognize that the transition down process (encompassing FPS and kNN operations) constitutes 71.77% of the total inference time. PTrAcc++ therefore proposes an integrated FPS-kNN architecture to select error-driven k neighbors, reducing repeated memory accesses and distance recalculations of intermediate results. Through extensive experimentation, PTrAcc++ demonstrates remarkable performance improvements, achieving end-to-end speedups of up to 2.96x, 1.70x, and 1.19x when compared to the state-of-the-art accelerators PointAcc (Lin et al., 2021), MARS (Yang et al., 2023), and PTrAcc (Lian et al., 2023), respectively, across a variety of point cloud neural networks.
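As a rough illustration of the two operators the abstract discusses, the sketch below implements textbook farthest point sampling and reuses each sampled point's distance row to pick its k nearest neighbors. It is a plain software baseline under stated assumptions, not the PTrAcc++ distribution-aware heuristic or its hardware architecture; the function name `fps_with_knn` and its parameters are hypothetical. It shows why every FPS iteration touches all points, and how sharing one distance computation between FPS and kNN avoids recomputing intermediate results.

```python
import math

def fps_with_knn(points, num_samples, k, start=0):
    """Baseline FPS that reuses each distance row for kNN (illustrative only).

    Assumes num_samples <= len(points) and k <= len(points).
    """
    n = len(points)
    # Distance from every point to its nearest already-sampled point.
    min_dist = [math.inf] * n
    selected, neighbors = [], {}
    current = start
    for _ in range(num_samples):
        selected.append(current)
        # Full scan: distances from the new sample to all points.
        row = [math.dist(points[current], p) for p in points]
        # Reuse the same row for kNN instead of recomputing distances later.
        # The sample itself (distance 0) is counted among its neighbors.
        neighbors[current] = sorted(range(n), key=row.__getitem__)[:k]
        # Update the running minimum, then pick the farthest remaining point.
        min_dist = [min(a, b) for a, b in zip(min_dist, row)]
        current = max(range(n), key=min_dist.__getitem__)
    return selected, neighbors

points = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0), (5, 0, 0)]
selected, neighbors = fps_with_knn(points, num_samples=3, k=2)
print(selected)
print(neighbors)
```

In this baseline each of the `num_samples` iterations computes a distance to all points, which is the per-iteration cost the abstract attributes to FPS; a separate kNN pass would compute the same rows a second time.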
Pages: 751-764
Number of pages: 14
Related papers
50 in total
  • [1] A Point Transformer Accelerator with Fine-Grained Pipelines and Distribution-Aware Dynamic FPS
    Lian, Yaoxiu
    Yang, Xinhao
    Hong, Ke
    Wang, Yu
    Dai, Guohao
    Xu, Ningyi
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023,
  • [2] DDA-Net: Deep Distribution-Aware Network for Point Cloud Compression
    Ahn, Junghyun
    Pang, Jiahao
    Lodhi, Muhammad Asad
    Tian, Dong
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [3] Not All Neighbors Matter: Point Distribution-Aware Pruning for 3D Point Cloud
    Lee, Yejin
    Lee, Donghyun
    Hong, JungUk
    Lee, Jae W.
    Yoon, Hongil
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 1240 - 1249
  • [4] Distribution-Aware Sampling of Answer Sets
    Nickles, Matthias
    SCALABLE UNCERTAINTY MANAGEMENT (SUM 2018), 2018, 11142 : 164 - 180
  • [5] Distribution-Aware Crowdsourced Entity Collection
    Fan, Ju
    Wei, Zhewei
    Zhang, Dongxiang
    Yang, Jingru
    Du, Xiaoyong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (07) : 1312 - 1326
  • [6] Distribution-aware fairness test generation
    Rajan, Sai Sathiesh
    Soremekun, Ezekiel
    Le Traon, Yves
    Chattopadhyay, Sudipta
    JOURNAL OF SYSTEMS AND SOFTWARE, 2024, 215
  • [7] Hierarchical Distribution-aware Testing of Deep Learning
    Huang, Wei
    Zhao, Xingyu
    Banks, Alec
    Cox, Victoria
    Huang, Xiaowei
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (02)
  • [8] Distribution-Aware Replay for Continual MRI Segmentation
    Lemke, Nick
    Gonzalez, Camila
    Mukhopadhyay, Anirban
    Mundt, Martin
    ARTIFICIAL INTELLIGENCE IN PANCREATIC DISEASE DETECTION AND DIAGNOSIS, AND PERSONALIZED INCREMENTAL LEARNING IN MEDICINE, AIPAD 2024, PILM 2024, 2025, 15197 : 73 - 85
  • [9] Unsupervised distribution-aware keypoints generation from 3D point clouds
    Wu, Yiqi
    Chen, Xingye
    Huang, Xuan
    Song, Kelin
    Zhang, Dejun
    NEURAL NETWORKS, 2024, 173
  • [10] Illumination Distribution-Aware Thermal Pedestrian Detection
    Li, Songtao
    Ye, Mao
    Ji, Luping
    Tang, Song
    Gan, Yan
    Zhu, Xiatian
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (11) : 18688 - 18700