MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition

Cited by: 0

Authors:

Luequan Wang

Hongbin Xu

Wenxiong Kang

Affiliation:

[1] South China University of Technology, School of Automation Science and Engineering

Source:

Machine Intelligence Research | 2023 / Vol. 20

Keywords:

Multi-view; unsupervised pretraining; contrastive learning; 3D vision; shape recognition

DOI:

Not available

Abstract:

3D shape recognition has drawn much attention in recent years, and view-based approaches perform best. However, current multi-view methods are almost all fully supervised, and their pretrained models are almost all based on ImageNet. Although ImageNet pretraining yields impressive results, there is still a significant discrepancy between multi-view datasets and ImageNet: multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, so building a second dataset of that kind is difficult. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representations into three fine-grained representations. Data augmentation is used to obtain pixel-level representations within each view. At the view level, we strengthen spatially invariant features. Finally, we exploit global information at the shape level through a novel extract-and-swap module. Experimental results demonstrate that the proposed method yields significant gains in 3D object classification and retrieval tasks and generalizes to cross-dataset tasks.
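
The abstract does not state which loss the model uses. As a rough illustration of the contrastive learning named in the keywords, the sketch below assumes a standard InfoNCE objective in which the embeddings of two views of the same shape form a positive pair and all other shapes in the batch act as negatives. The function names, the temperature value, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss (assumed objective, not confirmed by the abstract).

    anchors[i] and positives[i] are embeddings of two views of shape i;
    every positives[j] with j != i serves as a negative for anchors[i].
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


# Toy embeddings: two shapes, each seen from two views (hypothetical values).
view_a = [[1.0, 0.1], [0.1, 1.0]]
view_b = [[0.9, 0.2], [0.2, 0.9]]

aligned = info_nce(view_a, view_b)         # views paired with the correct shape
shuffled = info_nce(view_a, view_b[::-1])  # views paired with the wrong shape
print(aligned < shuffled)
```

The loss is low when views of the same shape lie close together in embedding space and high when they are mismatched, which is the general behavior a view-level contrastive stage would rely on.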

Pages: 872-883

Page count: 11
