MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition

Authors
Luequan Wang
Hongbin Xu
Wenxiong Kang
Affiliations
[1] South China University of Technology, School of Automation Science and Engineering
Source
Keywords
Multi-view; unsupervised pretraining; contrastive learning; 3D vision; shape recognition
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
3D shape recognition has drawn much attention in recent years, and view-based approaches currently perform best. However, almost all current multi-view methods are fully supervised, and almost all of their pretrained models are based on ImageNet. Although ImageNet pretraining gives impressive results, there is still a significant discrepancy between multi-view datasets and ImageNet: multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, so building a second dataset of comparable scale is difficult. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representation into three fine-grained representations. At the pixel level, data augmentation is used to obtain representations within each view. At the view level, we strengthen spatially invariant features. At the shape level, we exploit global information through a novel extract-and-swap module. Experimental results demonstrate that the proposed method yields significant gains in 3D object classification and retrieval tasks and generalizes to cross-dataset tasks.
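The abstract does not state which loss is used at any of the three levels. As a rough illustration of the view-level idea only, the sketch below assumes an InfoNCE-style contrastive objective in which two rendered views of the same shape form a positive pair and views of other shapes in the batch act as negatives. The function names `l2_normalize` and `info_nce`, the `temperature` parameter, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (assumed here, not confirmed by the abstract).

    anchors[i] and positives[i] are embeddings of two views of shape i;
    positives[j] for j != i serve as negatives for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view as the correct class.
        total += log_denom - logits[i]
    return total / len(a)


# Toy embeddings: two shapes, two views each (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b_matched = [[1.0, 0.0], [0.0, 1.0]]
view_b_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = info_nce(view_a, view_b_matched)
loss_swapped = info_nce(view_a, view_b_swapped)
```

With these toy inputs, the loss is lower when each anchor is paired with the view of its own shape than when the pairs are swapped, which is the behavior a view-invariant encoder would be trained toward under this assumed objective.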
Pages: 872-883
Number of pages: 11