Unsupervised Pre-training for Temporal Action Localization Tasks

Cited by: 9
Authors
Zhang, Can [1]
Yang, Tianyu [2]
Weng, Junwu [2]
Cao, Meng [1]
Wang, Jue [2]
Zou, Yuexian [1]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Beijing, Peoples R China
[2] Tencent AI Lab, Bellevue, WA USA
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52688.2022.01364
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unsupervised video representation learning has made remarkable achievements in recent years. However, most existing methods are designed and optimized for video classification. These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization. To bridge this gap, we make the first attempt to propose a self-supervised pretext task, coined as Pseudo Action Localization (PAL) to Unsupervisedly Pre-train feature encoders for Temporal Action Localization tasks (UP-TAL). Specifically, we first randomly select temporal regions, each of which contains multiple clips, from one video as pseudo actions and then paste them onto different temporal positions of the other two videos. The pretext task is to align the features of pasted pseudo action regions from two synthetic videos and maximize the agreement between them. Compared to the existing unsupervised video representation learning approaches, our PAL adapts better to downstream TAL tasks by introducing a temporal equivariant contrastive learning paradigm in a temporally dense and scale-aware manner. Extensive experiments show that PAL can utilize large-scale unlabeled video data to significantly boost the performance of existing TAL methods. Our codes and models will be made publicly available at https://github.com/zhang-can/UP-TAL.
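The copy-and-paste pretext task described in the abstract can be illustrated with a minimal sketch. This record gives no implementation details, so everything here is an assumption made for illustration: the function names are hypothetical, each video is represented directly by an array of clip features (an identity encoder stands in for the real one), region features are mean-pooled, and an InfoNCE-style loss stands in for "maximize the agreement". The scale-aware resizing of pasted regions is omitted.

```python
import numpy as np

def paste_pseudo_action(source, background, src_start, length, dst_start):
    """Paste `length` clips of `source` into a copy of `background`.

    Returns the synthetic video and the (start, end) span of the pasted region.
    """
    if src_start < 0 or src_start + length > len(source):
        raise ValueError("pseudo action region falls outside the source video")
    if dst_start < 0 or dst_start + length > len(background):
        raise ValueError("paste position falls outside the background video")
    synthetic = background.copy()
    synthetic[dst_start:dst_start + length] = source[src_start:src_start + length]
    return synthetic, (dst_start, dst_start + length)

def region_feature(clip_features, span):
    """Mean-pool clip features over a span and L2-normalize the result."""
    start, end = span
    pooled = clip_features[start:end].mean(axis=0)
    return pooled / np.linalg.norm(pooled)

def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when `query` agrees with `positive` more than with `negatives`."""
    logits = np.concatenate(([query @ positive], negatives @ query)) / temperature
    logits -= logits.max()  # subtract the maximum for numerical stability
    return float(-(logits[0] - np.log(np.exp(logits).sum())))

# Three toy "videos": 16 clips each, 8-dimensional clip features (hypothetical sizes).
rng = np.random.default_rng(0)
video_a, video_b, video_c = (rng.normal(size=(16, 8)) for _ in range(3))

# Take one pseudo action (5 clips of video_a) and paste it at different
# temporal positions of the two other videos.
synth_b, span_b = paste_pseudo_action(video_a, video_b, src_start=4, length=5, dst_start=2)
synth_c, span_c = paste_pseudo_action(video_a, video_c, src_start=4, length=5, dst_start=9)

# Features of the two pasted regions form the positive pair;
# background-only regions serve as negatives.
feat_b = region_feature(synth_b, span_b)
feat_c = region_feature(synth_c, span_c)
negatives = np.stack([region_feature(synth_b, (8, 13)), region_feature(synth_c, (1, 6))])

loss_aligned = info_nce(feat_b, feat_c, negatives)
# A region shifted away from the true paste position agrees less with feat_b.
loss_shifted = info_nce(feat_b, region_feature(synth_c, (7, 12)), negatives)
print(loss_aligned, loss_shifted)
```

In this toy setup the correctly localized pair yields the lower loss, which is the property that makes the pretext task sensitive to temporal position. With a learned encoder the two pasted regions would no longer have identical features, and training would have to pull them together.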
Pages: 14011-14021
Page count: 11