Self-supervised Video Representation Learning with Cascade Positive Retrieval

Cited by: 1

Authors:

Wu, Cheng-En [1]

Lai, Farley [2]

Hu, Yu Hen [1]

Kadav, Asim [2]

Affiliations:

[1] Univ Wisconsin Madison, Dept Elect & Comp Engn, Madison, WI 53706 USA

[2] NEC Labs Amer Inc, San Jose, CA USA

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022 | 2022

DOI:

10.1109/CVPRW56347.2022.00452

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Self-supervised video representation learning has been shown to effectively improve downstream tasks such as video retrieval and action recognition. In this paper, we present Cascade Positive Retrieval (CPR), which successively mines positive examples with respect to the query for contrastive learning in a cascade of stages. Specifically, CPR exploits multiple views of a query example in different modalities, where an alternative view may help find another positive example that is dissimilar in the query view. We explore the effects of possible CPR configurations in ablations, including the number of mining stages, the top similar example selection ratio in each stage, and progressive training with an incremental number of the final Top-k selection. The overall mining quality is measured to reflect the recall across training set classes. CPR reaches a median class mining recall of 83.3%, outperforming previous work by 5.5%. Implementation-wise, CPR is complementary to pretext tasks and can be easily applied to previous work. When pretrained on UCF101, CPR consistently improves existing work and achieves state-of-the-art R@1 of 56.7% and 24.4% in video retrieval, as well as 83.8% and 54.8% in action recognition, on UCF101 and HMDB51, respectively.
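The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, how views are assigned to stages, or how the selection ratio is applied, so the sketch assumes cosine-style dot-product similarity, one view per stage (for example RGB, then optical flow), and a per-stage keep ratio followed by a final Top-k cut. The function name `cascade_positive_retrieval` and all parameter names are hypothetical.

```python
import numpy as np


def cascade_positive_retrieval(query_views, bank_views, ratios, top_k):
    """Mine positives for one query by filtering candidates stage by stage.

    query_views: list of 1-D arrays, one query embedding per view (assumed).
    bank_views:  list of 2-D arrays (N x D), candidate embeddings per view.
    ratios:      fraction of surviving candidates kept after each non-final stage.
    top_k:       number of positives returned by the final stage.
    """
    num_stages = len(query_views)
    candidates = np.arange(bank_views[0].shape[0])
    for stage in range(num_stages):
        # Similarity of the surviving candidates to the query in this stage's view.
        sims = bank_views[stage][candidates] @ query_views[stage]
        order = np.argsort(-sims, kind="stable")  # most similar first
        if stage < num_stages - 1:
            # Intermediate stage: keep a ratio of candidates, never fewer than top_k.
            keep = max(top_k, int(np.ceil(ratios[stage] * len(candidates))))
        else:
            # Final stage: keep only the Top-k.
            keep = top_k
        candidates = candidates[order[: min(keep, len(candidates))]]
    return candidates


# Toy data: 6 candidates, 2 views, 2-D embeddings (values chosen by hand).
query_views = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
bank_views = [
    np.array([[1.0, 0], [0.9, 0], [0.8, 0], [0.7, 0], [0.1, 0], [0.0, 0]]),
    np.array([[0, 0.2], [0, 0.9], [0, 0.5], [0, 1.0], [0, 0.95], [0, 0.0]]),
]
mined = cascade_positive_retrieval(query_views, bank_views, ratios=[0.5], top_k=2)
print(mined.tolist())  # [1, 2]
```

In this toy run, candidate 3 is the most similar item in the second view but is never returned, because the first stage already filtered it out. Under the stated assumptions, this is how a cascade restricts each later view to candidates that earlier views also rank highly.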

Citation

Pages: 4079-4088

Number of pages: 10

