Knowledge-guided pre-training and fine-tuning: Video representation learning for action recognition

Cited by: 3
Authors
Wang, Guanhong [1 ,2 ]
Zhou, Yang [1 ]
He, Zhanhao [1 ]
Lu, Keyu [1 ]
Feng, Yang [3 ]
Liu, Zuozhu [1 ,2 ]
Wang, Gaoang [1 ,2 ]
Affiliations
[1] Zhejiang Univ, Zhejiang Univ Univ Illinois Urbana Champaign Inst, Haining 314400, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Angelalign Inc, Angelalign Res Inst, Shanghai 200011, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Video representation learning; Knowledge distillation; Action recognition; Video retrieval;
DOI
10.1016/j.neucom.2023.127136
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video-based action recognition is an important task in computer vision, aiming to extract rich spatial-temporal information for recognizing human actions in videos. Many approaches adopt self-supervised learning on large-scale unlabeled datasets and transfer the learned representations to the downstream action recognition task. Despite much progress in action recognition with video representation learning, two main issues remain for most existing methods. First, pre-training with self-supervised pretext tasks usually yields neutral, weakly informative representations for downstream action recognition. Second, the valuable knowledge learned from large-scale pre-training datasets is gradually forgotten during fine-tuning. To address these issues, we propose a novel video representation learning method with knowledge-guided pre-training and fine-tuning for action recognition, which incorporates external human parsing knowledge to generate informative representations during pre-training, and preserves the pre-trained knowledge during fine-tuning via self-distillation to avoid catastrophic forgetting. Our model, combining external human parsing knowledge, video-level contrastive learning, and knowledge-preserving self-distillation, achieves state-of-the-art performance on two popular benchmarks, UCF101 and HMDB51, verifying the effectiveness of the proposed method.
Pages: 10
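
For readers who want a concrete picture of the objectives named in the abstract, the minimal PyTorch sketch below shows standard formulations of two of them: a video-level InfoNCE contrastive loss for pre-training and a knowledge-preserving self-distillation loss for fine-tuning, computed against a frozen copy of the pre-trained model. This is an illustrative sketch, not the authors' implementation: all function names, the temperature values, and the weight alpha are assumptions, and the paper's exact objectives are given at the DOI above.

import torch
import torch.nn.functional as F

def video_infonce_loss(z_a, z_b, temperature=0.1):
    # Video-level contrastive loss (standard InfoNCE): embeddings of two
    # augmented clips from the same video are positives; clips from all
    # other videos in the batch serve as negatives.
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature              # (B, B) cosine similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)

def self_distillation_loss(student_logits, teacher_logits, tau=4.0):
    # Knowledge-preserving self-distillation: softened predictions of a
    # frozen copy of the pre-trained model (teacher) regularize the
    # fine-tuned model (student), mitigating catastrophic forgetting.
    p_t = F.softmax(teacher_logits / tau, dim=1)
    log_p_s = F.log_softmax(student_logits / tau, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * tau * tau

def fine_tune_step(student, teacher, clips, labels, alpha=0.5):
    # One possible fine-tuning objective: supervised action-recognition
    # loss plus the distillation term; alpha (an assumed value) balances
    # adaptation against knowledge preservation.
    with torch.no_grad():
        t_logits = teacher(clips)                     # teacher stays frozen
    s_logits = student(clips)
    ce = F.cross_entropy(s_logits, labels)
    return ce + alpha * self_distillation_loss(s_logits, t_logits)

The tau * tau factor in the distillation term is the usual gradient-scale correction for temperature-softened knowledge distillation (Hinton et al., 2015); the abstract's third ingredient, external human parsing knowledge, depends on a parsing model and is not sketched here.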