Knowledge-guided pre-training and fine-tuning: Video representation learning for action recognition

Cited by: 3
Authors
Wang, Guanhong [1 ,2 ]
Zhou, Yang [1 ]
He, Zhanhao [1 ]
Lu, Keyu [1 ]
Feng, Yang [3 ]
Liu, Zuozhu [1 ,2 ]
Wang, Gaoang [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Zhejiang Univ Univ Illinois Urbana Champaign Inst, Haining 314400, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Angelalign Inc, Angelalign Res Inst, Shanghai 200011, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Video representation learning; Knowledge distillation; Action recognition; Video retrieval;
DOI
10.1016/j.neucom.2023.127136
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video-based action recognition is an important task in the computer vision community, aiming to extract rich spatial-temporal information from videos to recognize human actions. Many approaches adopt self-supervised learning on large-scale unlabeled datasets and transfer the learned representations to the downstream action recognition task. Although video representation learning has advanced action recognition considerably, two main issues remain for most existing methods. First, pre-training with self-supervised pretext tasks tends to learn generic representations that carry little task-specific information for downstream action recognition. Second, the valuable knowledge acquired from large-scale pre-training datasets is gradually forgotten during fine-tuning. To address these issues, we propose a novel video representation learning method with knowledge-guided pre-training and fine-tuning for action recognition: it incorporates external human parsing knowledge to generate informative representations during pre-training, and preserves the pre-trained knowledge during fine-tuning via self-distillation to avoid catastrophic forgetting. Our model, combining external human parsing knowledge, video-level contrastive learning, and knowledge-preserving self-distillation, achieves state-of-the-art performance on two popular benchmarks, UCF101 and HMDB51, verifying the effectiveness of the proposed method.
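The abstract describes knowledge-preserving self-distillation during fine-tuning: a frozen copy of the pre-trained encoder acts as a teacher whose features anchor the student while it learns the classification task. The sketch below illustrates that general idea in PyTorch; it is not the authors' implementation, and the encoder interface, feature dimension, cosine-similarity distillation term, and loss weighting are all assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code): knowledge-preserving
# self-distillation during fine-tuning for action recognition.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class DistilledFineTuner(nn.Module):
    def __init__(self, pretrained_encoder: nn.Module, feature_dim: int, num_classes: int):
        super().__init__()
        self.student = pretrained_encoder                 # updated during fine-tuning
        self.teacher = copy.deepcopy(pretrained_encoder)  # frozen snapshot of pre-trained weights
        for p in self.teacher.parameters():
            p.requires_grad_(False)
        self.classifier = nn.Linear(feature_dim, num_classes)

    def forward(self, clips: torch.Tensor, labels: torch.Tensor,
                distill_weight: float = 1.0) -> torch.Tensor:
        feat_s = self.student(clips)                      # (B, feature_dim) student features
        with torch.no_grad():
            feat_t = self.teacher(clips)                  # pre-trained "memory" features
        logits = self.classifier(feat_s)

        # Supervised action-recognition loss on the downstream labels.
        cls_loss = F.cross_entropy(logits, labels)
        # Self-distillation term (assumed cosine form): keep student features
        # close to the frozen teacher to mitigate catastrophic forgetting.
        distill_loss = 1.0 - F.cosine_similarity(feat_s, feat_t, dim=-1).mean()
        return cls_loss + distill_weight * distill_loss
```

In this kind of setup, the distillation weight trades off plasticity on the downstream task against retention of pre-trained knowledge; the paper's actual distillation objective and weighting may differ.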
Pages: 10
Related Papers
50 records in total
  • [21] SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models
    Thangarasa, Vithursan
    Gupta, Abhay
    Marshall, William
    Li, Tianda
    Leong, Kevin
    DeCoste, Dennis
    Lie, Sean
    Saxena, Shreyas
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 2134 - 2146
  • [22] Fine-tuning and multilingual pre-training for abstractive summarization task for the Arabic language
    Kahla, Mram
    Novak, Attila
    Yang, Zijian Gyozo
    ANNALES MATHEMATICAE ET INFORMATICAE, 2023, 57 : 24 - 35
  • [23] MISS: A Generative Pre-training and Fine-Tuning Approach for Med-VQA
    Chen, Jiawei
    Yang, Dingkang
    Jiang, Yue
    Lei, Yuxuan
    Zhang, Lihua
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT VIII, 2024, 15023 : 299 - 313
  • [24] FACTPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
    Wan, David
    Bansal, Mohit
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1010 - 1028
  • [25] Evaluation of pre-training impact on fine-tuning for remote sensing scene classification
    Yuan, Man
    Liu, Zhi
    Wang, Fan
    REMOTE SENSING LETTERS, 2019, 10 (01) : 49 - 58
  • [27] Pre-training and Fine-tuning Neural Topic Model: A Simple yet Effective Approach to Incorporating External Knowledge
    Zhang, Linhai
    Hu, Xumeng
    Wang, Boyu
    Zhou, Deyu
    Zhang, Qian-Wen
    Cao, Yunbo
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 5980 - 5989
  • [28] Breaking the Barrier Between Pre-training and Fine-tuning: A Hybrid Prompting Model for Knowledge-Based VQA
    Sun, Zhongfan
    Hu, Yongli
    Gao, Qingqing
    Jiang, Huajie
    Gao, Junbin
    Sun, Yanfeng
    Yin, Baocai
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4065 - 4073
  • [29] KID-Review: Knowledge-Guided Scientific Review Generation with Oracle Pre-training
    Yuan, Weizhe
    Liu, Pengfei
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11639 - 11647
  • [30] Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction
    Peng Su
    K. Vijay-Shanker
    BMC Bioinformatics, 23