A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Cited by: 138
Authors
Feichtenhofer, Christoph [1]
Fan, Haoqi [1]
Xiong, Bo [1]
Girshick, Ross [1]
He, Kaiming [1]
Affiliation
[1] Facebook AI Res FAIR, Paris, France
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
DOI
10.1109/CVPR46437.2021.00331
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a large-scale study on unsupervised spatiotemporal representation learning from videos. With a unified perspective on four recent image-based frameworks, we study a simple objective that can easily generalize all these methods to space-time. Our objective encourages temporally-persistent features in the same video, and in spite of its simplicity, it works surprisingly well across: (i) different unsupervised frameworks, (ii) pre-training datasets, (iii) downstream datasets, and (iv) backbone architectures. We draw a series of intriguing observations from this study, e.g., we discover that encouraging long-spanned persistency can be effective even if the timespan is 60 seconds. In addition to state-of-the-art results in multiple benchmarks, we report a few promising cases in which unsupervised pre-training can outperform its supervised counterpart.
Pages: 3298-3308
Number of pages: 11