MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition

Cited by: 0
Authors
Luequan Wang
Hongbin Xu
Wenxiong Kang
Affiliations
[1] South China University of Technology, School of Automation Science and Engineering
Source
Keywords
Multi-view; unsupervised pretraining; contrastive learning; 3D vision; shape recognition
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
3D shape recognition has drawn much attention in recent years, and view-based approaches currently perform best. However, almost all existing multi-view methods are fully supervised, and their pretrained models are almost all based on ImageNet. Although ImageNet pretraining yields impressive results, a significant discrepancy remains between multi-view datasets and ImageNet: multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, so building a second dataset of that kind is difficult. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representation into three fine-grained representations. At the pixel level, data augmentation is used to obtain representations within each view. At the view level, we strengthen spatially invariant features. Finally, at the shape level, we exploit global information through a novel extract-and-swap module. Experimental results demonstrate that the proposed method yields significant gains in 3D object classification and retrieval tasks and generalizes to cross-dataset tasks.
Pages: 872-883
Page count: 11