A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Cited by: 138

Authors:

Feichtenhofer, Christoph [1]

Fan, Haoqi [1]

Xiong, Bo [1]

Girshick, Ross [1]

He, Kaiming [1]

Affiliations:

[1] Facebook AI Res FAIR, Paris, France

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Keywords:

DOI:

10.1109/CVPR46437.2021.00331

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a large-scale study on unsupervised spatiotemporal representation learning from videos. With a unified perspective on four recent image-based frameworks, we study a simple objective that can easily generalize all these methods to space-time. Our objective encourages temporally-persistent features in the same video, and in spite of its simplicity, it works surprisingly well across: (i) different unsupervised frameworks, (ii) pre-training datasets, (iii) downstream datasets, and (iv) backbone architectures. We draw a series of intriguing observations from this study, e.g., we discover that encouraging long-spanned persistency can be effective even if the timespan is 60 seconds. In addition to state-of-the-art results in multiple benchmarks, we report a few promising cases in which unsupervised pre-training can outperform its supervised counterpart.
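
The abstract describes an objective that encourages temporally-persistent features within the same video. As a rough illustration only, the sketch below uses a contrastive (InfoNCE-style) loss in which two clips from the same video form a positive pair and clips from other videos act as negatives. The choice of a contrastive loss, the function names `l2_normalize` and `persistency_loss`, the temperature value, and the synthetic embeddings are all assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def persistency_loss(clips_a, clips_b, temperature=0.1):
    """InfoNCE-style loss: clips_a[i] and clips_b[i] come from the same video.

    The loss is low when each clip embedding is closer to the other clip
    of its own video than to clips of different videos.
    """
    queries = [l2_normalize(v) for v in clips_a]
    keys = [l2_normalize(v) for v in clips_b]
    total = 0.0
    for i, q in enumerate(queries):
        # Cosine similarity to every key, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the clip from the same video.
        total += log_denom - logits[i]
    return total / len(queries)


# Synthetic stand-ins for encoder outputs (hypothetical data, not real features).
random.seed(0)
dim, num_videos = 16, 6
videos = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_videos)]


def clip_of(video):
    """Simulate a clip embedding as the video's identity plus small noise."""
    return [x + random.gauss(0, 0.05) for x in video]


clips_a = [clip_of(v) for v in videos]
clips_b = [clip_of(v) for v in videos]

aligned = persistency_loss(clips_a, clips_b)
# Pair each clip with a clip from a different video to break persistency.
mismatched = persistency_loss(clips_a, clips_b[1:] + clips_b[:1])
print(aligned < mismatched)
```

In this toy setup the loss is smaller when the paired clips really share a video than when the pairing is shifted, which is the behavior a persistency objective is meant to reward.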


Pages: 3298-3308

Number of pages: 11

50 records in total

[11] Large-scale knowledge graph representation learning
Badrouni, Marwa
Katar, Chaker
Inoubli, Wissem
KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (09) : 5479 - 5499
[12] MMSVC: An Efficient Unsupervised Learning Approach for Large-Scale Datasets
Gu, Hong
Zhao, Guangzhou
Zhang, Jianliang
LIFE SYSTEM MODELING AND INTELLIGENT COMPUTING, 2010, 6330 : 1 - 9
[13] MMSVC: An efficient unsupervised learning approach for large-scale datasets
Gu, Hong
Zhao, Guangzhou
Zhang, Jianliang
NEUROCOMPUTING, 2012, 98 : 114 - 122
[14] Efficient Large-Scale Visual Representation Learning and Evaluation
Dolev, Eden
Awad, Alaa
Roberts, Denisa Olteanu
Ebrahimzadeh, Zahra
Mejran, Marcin
Malpani, Vaibhav
Yavuz, Mahir
REVOLUTIONIZING FASHION AND RETAIL, 2025, 1299 : 97 - 111
[15] Dynamic Representation Learning for Large-Scale Attributed Networks
Liu, Zhijun
Huang, Chao
Yu, Yanwei
Song, Peng
Fan, Baode
Dong, Junyu
CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 1005 - 1014
[16] Prototype Memory for Large-Scale Face Representation Learning
Smirnov, Evgeny
Garaev, Nikita
Galyuk, Vasiliy
Lukyanets, Evgeny
IEEE ACCESS, 2022, 10 : 12031 - 12046
[17] An Unsupervised Learning Network for Large-Scale LiDAR Point Clouds Registration
Liu, Jingbin
Lv, Xuanfan
Gong, Xiaodong
Liang, Yifan
Hyyppa, Juha
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (11) : 16187 - 16200
[18] Supervised and Unsupervised Parallel Subspace Learning for Large-Scale Image Recognition
Jing, Xiao-Yuan
Li, Sheng
Zhang, David
Yang, Jian
Yang, Jing-Yu
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2012, 22 (10) : 1497 - 1511
[19] Large-Scale Unsupervised Semantic Segmentation
Gao, Shanghua
Li, Zhong-Yu
Yang, Ming-Hsuan
Cheng, Ming-Ming
Han, Junwei
Torr, Philip
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (06) : 7457 - 7476
[20] Large-Scale Unsupervised Object Discovery
Vo, Huy V.
Sizikova, Elena
Schmid, Cordelia
Perez, Patrick
Ponce, Jean
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
