A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Cited by: 138
Authors
Feichtenhofer, Christoph [1]
Fan, Haoqi [1]
Xiong, Bo [1]
Girshick, Ross [1]
He, Kaiming [1]
Affiliations
[1] Facebook AI Res FAIR, Paris, France
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
DOI
10.1109/CVPR46437.2021.00331
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a large-scale study on unsupervised spatiotemporal representation learning from videos. With a unified perspective on four recent image-based frameworks, we study a simple objective that can easily generalize all these methods to space-time. Our objective encourages temporally-persistent features in the same video, and in spite of its simplicity, it works surprisingly well across: (i) different unsupervised frameworks, (ii) pre-training datasets, (iii) downstream datasets, and (iv) backbone architectures. We draw a series of intriguing observations from this study, e.g., we discover that encouraging long-spanned persistency can be effective even if the timespan is 60 seconds. In addition to state-of-the-art results in multiple benchmarks, we report a few promising cases in which unsupervised pre-training can outperform its supervised counterpart.
Pages: 3298-3308
Number of pages: 11