MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition

Cited by: 0
Authors
Luequan Wang
Hongbin Xu
Wenxiong Kang
Affiliation
[1] South China University of Technology, School of Automation Science and Engineering
Source
Keywords
Multi-view; unsupervised pretraining; contrastive learning; 3D vision; shape recognition
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
3D shape recognition has drawn much attention in recent years, and view-based approaches currently perform best. However, nearly all existing multi-view methods are fully supervised, and nearly all of them rely on models pretrained on ImageNet. Although ImageNet pretraining yields impressive results, a significant discrepancy remains between multi-view datasets and ImageNet: multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, so building a comparable dataset is difficult. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representation into three fine-grained representations. At the pixel level, data augmentation is used to obtain representations within each view. At the view level, we strengthen spatially invariant features. At the shape level, we exploit global information through a novel extract-and-swap module. Experimental results demonstrate that the proposed method yields significant gains in 3D object classification and retrieval tasks and generalizes to cross-dataset tasks.
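The abstract names contrastive learning across views but gives no loss function. As a rough illustration only, the sketch below computes a generic InfoNCE-style loss in plain Python, treating two views of the same shape as a positive pair and views of other shapes as negatives. This is an assumption about the general family of objective, not the authors' formulation: the function names, the temperature value, and the toy embeddings are all hypothetical, and the extract-and-swap module is not modeled.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other row.

    Generic contrastive objective, assumed here for illustration; the
    temperature of 0.1 is an arbitrary choice, not a value from the paper.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    losses = []
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching view as the correct class.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: two shapes, each rendered from two viewpoints.
view_a = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]]
view_b = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3]]

aligned = info_nce(view_a, view_b)         # correct view pairing
shuffled = info_nce(view_a, view_b[::-1])  # deliberately mismatched pairing
print(f"aligned pairs:  {aligned:.4f}")
print(f"shuffled pairs: {shuffled:.4f}")
```

With correctly paired views the loss is lower than with mismatched pairs, which is the signal such an objective would use to pull embeddings of the same shape together across viewpoints.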
Pages: 872-883
Page count: 11