A Coarse-to-Fine Framework for Point Voxel Transformer

Cited by: 0
Authors
Bai, Zhuhua [1]
Meng, Fantong [1]
Li, Weiqing [1]
Kang, Renke [1]
Yang, Guolin [1]
Dong, Zhigang [1]
Affiliations
[1] Dalian University of Technology, Dalian, People's Republic of China
Keywords
3D vision; PVT; Coarse-to-Fine; Coarse-grained; Important Voxel Identification; Fine-grained
DOI
10.1109/CSCWD61410.2024.10580279
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The input point clouds of the traditional point voxel transformer (PVT) model are highly redundant in the spatial dimensions, which incurs massive computation and memory costs. To address this problem, we propose a novel coarse-to-fine point voxel transformer framework (CF-PVT) that relieves the computation and memory burden while retaining performance. CF-PVT performs network inference in two stages. In the coarse inference stage, the input point cloud is split into coarse-grained voxels for economical computation. If the input cannot be identified well at this stage, important voxels containing rich information are identified by the Important Voxel Identification Module and further split into fine-grained voxels. We conduct extensive experiments on traditional classification and segmentation tasks, and the results demonstrate that the CF-PVT framework is highly effective. For example, while maintaining similar accuracy, CF-PVT reduces the FLOPs of PVT1 by 60.1% and its latency by 68.9% on the ModelNet40 dataset.
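The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how "identified well" is measured or how voxel importance is scored, so the `confidence` input, the `threshold`, the point-count importance score, and all function names below are hypothetical stand-ins chosen only to make the control flow concrete.

```python
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group 3D points by the integer index of the voxel that contains them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def select_important(voxels, keep_ratio):
    """Pick the top fraction of voxels by a stand-in importance score.

    Assumption: importance is approximated by point count. The paper's
    Important Voxel Identification Module is not described in the abstract.
    """
    ranked = sorted(voxels, key=lambda k: len(voxels[k]), reverse=True)
    n_keep = max(1, round(len(ranked) * keep_ratio))
    return set(ranked[:n_keep])


def coarse_to_fine_voxels(points, coarse_size, confidence,
                          threshold=0.8, keep_ratio=0.5, split=2):
    """Return a list of (voxel_size, voxel_index, points) tokens.

    `confidence` stands in for the coarse-stage prediction confidence
    (hypothetical early-exit criterion).
    """
    coarse = voxelize(points, coarse_size)

    # Coarse stage: if the coarse prediction is confident enough, stop here.
    if confidence >= threshold:
        return [(coarse_size, key, pts) for key, pts in coarse.items()]

    # Fine stage: re-split only the important voxels into finer voxels.
    important = select_important(coarse, keep_ratio)
    fine_size = coarse_size / split
    tokens = []
    for key, pts in coarse.items():
        if key in important:
            for fine_key, fine_pts in voxelize(pts, fine_size).items():
                tokens.append((fine_size, fine_key, fine_pts))
        else:
            tokens.append((coarse_size, key, pts))
    return tokens


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.2, 0.2, 0.2), (1.5, 0.5, 0.5)]
    print(len(coarse_to_fine_voxels(cloud, 1.0, confidence=0.9)))
    print(len(coarse_to_fine_voxels(cloud, 1.0, confidence=0.5)))
```

In this sketch the savings come from two places: confident inputs never reach the fine stage, and uncertain inputs are refined only in the selected voxels, so the token count grows far less than it would under uniform fine voxelization.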
Pages: 205-211
Number of pages: 7