Point Clouds are Specialized Images: A Knowledge Transfer Approach for 3D Understanding

Cited by: 0
Authors
Kang, Jiachen [1]
Jia, Wenjing [1]
He, Xiangjian [2]
Lam, Kin Man [3]
Affiliations
[1] Univ Technol Sydney, Sch Elect & Data Engn, Sydney, NSW 2007, Australia
[2] Univ Nottingham Ningbo, Sch Comp Sci, Ningbo 315100, Peoples R China
[3] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Point cloud compression; Three-dimensional displays; Transformers; Task analysis; Data models; Image coding; Knowledge transfer; Cross-modal learning; point cloud understanding; self-supervision; transfer learning
DOI
10.1109/TMM.2024.3412330
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Self-supervised representation learning (SSRL) has gained increasing attention in point cloud understanding as a way to address the challenges posed by 3D data scarcity and high annotation costs. This paper presents PCExpert, a novel SSRL approach that reinterprets point clouds as "specialized images". This conceptual shift allows PCExpert to leverage knowledge derived from the large-scale image modality more directly and deeply, by extensively sharing parameters with a pre-trained image encoder in a multi-way Transformer architecture. The parameter-sharing strategy, combined with an additional pre-training pretext task, transformation estimation, enables PCExpert to outperform the state of the art in a variety of tasks while substantially reducing the number of trainable parameters. Notably, PCExpert's performance under LINEAR fine-tuning (e.g., 90.02% overall accuracy on ScanObjectNN) already closely approaches the results obtained with FULL model fine-tuning (92.66%), demonstrating its effective representation capability.
Pages: 10755-10765
Number of pages: 11