MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition

Cited by: 0

Authors:

Luequan Wang

Hongbin Xu

Wenxiong Kang

Affiliation:

[1] South China University of Technology, School of Automation Science and Engineering

Source:

Machine Intelligence Research | 2023 / Vol. 20

Keywords:

Multi-view; unsupervised pretraining; contrastive learning; 3D vision; shape recognition

DOI:

Not available

Abstract:

3D shape recognition has drawn much attention in recent years, and view-based approaches perform best among existing methods. However, current multi-view methods are almost all fully supervised, and their pretrained models are almost all based on ImageNet. Although ImageNet pretraining gives impressive results, there is still a significant discrepancy between multi-view datasets and ImageNet, because multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, so building a second dataset of comparable scale is difficult. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representation into three fine-grained representations. At the pixel level, data augmentation is used to obtain representations within each view. At the view level, we strengthen spatially invariant features. At the shape level, we exploit global information through a novel extract-and-swap module. Experimental results demonstrate that the proposed method yields significant gains in 3D object classification and retrieval tasks and generalizes to cross-dataset tasks.
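The abstract names contrastive learning but does not state the loss used. As a rough illustration only, the sketch below shows a generic InfoNCE-style objective between two sets of embeddings, where row i of each set forms a positive pair (for example, two augmentations of the same view). The function names `l2_normalize` and `info_nce`, the temperature value, and the toy embeddings are all assumptions made for this example; this is not the authors' three-stage model or their extract-and-swap module.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(z_a, z_b, temperature=0.1):
    # Generic InfoNCE loss (an assumption, not the paper's stated objective).
    # z_a[i] and z_b[i] are treated as a positive pair; every other row of
    # z_b serves as a negative for anchor z_a[i].
    z_a = [l2_normalize(v) for v in z_a]
    z_b = [l2_normalize(v) for v in z_b]
    losses = []
    for i, anchor in enumerate(z_a):
        logits = [
            sum(p * q for p, q in zip(anchor, other)) / temperature
            for other in z_b
        ]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy embeddings (hypothetical): view_b is a slightly perturbed copy of view_a.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
view_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]

aligned = info_nce(view_a, view_b)
# Rotating view_b by one position breaks the pairing, so the loss should rise.
shuffled = info_nce(view_a, view_b[1:] + view_b[:1])
```

With correctly paired embeddings the loss is close to zero, and it grows when the pairing is broken, which is the behavior a contrastive pretraining objective relies on.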

Pages: 872-883

Page count: 11
