MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition

Cited by: 0

Authors:

Luequan Wang

Hongbin Xu

Wenxiong Kang

Affiliation:

[1] South China University of Technology, School of Automation Science and Engineering

Source:

Machine Intelligence Research | 2023 / Vol. 20

Keywords:

Multi-view; unsupervised pretraining; contrastive learning; 3D vision; shape recognition

DOI:

Not available

Abstract:

3D shape recognition has drawn much attention in recent years, and view-based approaches achieve the best performance. However, current multi-view methods are almost all fully supervised, and their pretrained models are almost all based on ImageNet. Although ImageNet pretraining yields impressive results, there is still a significant discrepancy between multi-view datasets and ImageNet: multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, making it difficult to build another such dataset. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representations into three fine-grained representations. At the pixel level, data augmentation is used to obtain representations within each view. At the view level, we strengthen spatially invariant features. At the shape level, we exploit global information through a novel extract-and-swap module. Experimental results demonstrate that the proposed method yields significant gains in 3D object classification and retrieval tasks and generalizes to cross-dataset tasks.
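
The abstract does not state which contrastive objective is used. As a generic illustration of contrastive pretraining between two views of the same shape, the sketch below implements the widely used InfoNCE loss in plain Python. It is an assumption for illustration only, not the authors' three-stage model or their extract-and-swap module; the names `l2_normalize` and `info_nce` and the temperature value are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs.

    anchors[i] and positives[i] are assumed to embed two views of the
    same shape; every positives[j] with j != i acts as a negative.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denom - logits[i]
    return total / len(a)


views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b = [[0.9, 0.1], [0.1, 0.9]]
aligned = info_nce(views_a, views_b)
shuffled = info_nce(views_a, views_b[::-1])
```

In this toy batch the correctly paired views give a lower loss than the shuffled pairing, which is the signal a contrastive pretraining stage optimizes.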

Pages: 872-883

Page count: 11

50 items in total

[1] MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition
Wang, Luequan
Xu, Hongbin
Kang, Wenxiong
MACHINE INTELLIGENCE RESEARCH, 2023, 20 (06) : 872 - 883
[2] Learning Relationships for Multi-View 3D Object Recognition
Yang, Ze
Wang, Liwei
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 7504 - 7513
[3] 3D LayoutCRF for multi-view object class recognition and segmentation
Hoiem, Derek
Rother, Carsten
Winn, John
2007 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-8, 2007, : 580 - +
[4] Learning Disentangled Representation for Multi-View 3D Object Recognition
Huang, Jingjia
Yan, Wei
Li, Ge
Li, Thomas
Liu, Shan
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (02) : 646 - 659
[5] Multi-view convolutional vision transformer for 3D object recognition
Li, Jie
Liu, Zhao
Li, Li
Lin, Junqin
Yao, Jian
Tu, Jingmin
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 95
[6] Multi-view ensemble manifold regularization for 3D object recognition
Hong, Chaoqun
Yu, Jun
You, Jane
Chen, Xuhui
Tao, Dapeng
INFORMATION SCIENCES, 2015, 320 : 395 - 405
[7] Multi-view dual attention network for 3D object recognition
Wenju Wang
Yu Cai
Tao Wang
Neural Computing and Applications, 2022, 34 : 3201 - 3212
[8] Deep models for multi-view 3D object recognition: a review
Alzahrani, Mona
Usman, Muhammad
Jarraya, Salma Kammoun
Anwar, Saeed
Helmy, Tarek
ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (12)
[9] Multi-view dual attention network for 3D object recognition
Wang, Wenju
Cai, Yu
Wang, Tao
NEURAL COMPUTING & APPLICATIONS, 2022, 34 (04) : 3201 - 3212
[10] Multi-view Harmonized Bilinear Network for 3D Object Recognition
Yu, Tan
Meng, Jingjing
Yuan, Junsong
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 186 - 194