Distilling interaction knowledge for semi-supervised egocentric action recognition

Cited by: 0
Authors
Wang, Haoran [1 ]
Yang, Jiahao [1 ]
Yu, Baosheng [2 ]
Zhan, Yibing [3 ]
Tao, Dapeng [4 ]
Ling, Haibin [5 ]
Affiliations
[1] College of Information Science and Engineering, Northeastern University, Shenyang, 110819, China
[2] School of Computer Science, The University of Sydney, Darlington, NSW, 2008, Australia
[3] JD Explore Academy, Beijing, 100176, China
[4] School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, 650091, China
[5] Department of Computer Science, Stony Brook University, Stony Brook, United States
Keywords
Contrastive learning; Semi-supervised learning; Video analysis
DOI
10.1016/j.patcog.2024.110927
Abstract
Egocentric action recognition, the identification of actions in video captured from a first-person perspective, is receiving increasing attention due to the widespread adoption of wearable cameras. Nonetheless, annotating actions in videos characterized by cluttered backgrounds and the presence of various objects is labor-intensive. In this paper, we consider learning for egocentric action recognition in a semi-supervised manner. Inspired by the fact that videos captured from a first-person viewpoint usually contain rich content about how human hands interact with objects, we propose to employ a popular teacher–student framework and distill the interaction knowledge between hands and objects for semi-supervised egocentric action recognition. We refer to the proposed method as Interaction Knowledge Distillation, or IKD. Specifically, the teacher network takes the hands and action-related objects in labeled videos as input and uses graph neural networks to capture their spatial–temporal relations as graph edge features. The student network then takes the detected hands/objects from both labeled and unlabeled videos as input and mimics the teacher network to learn from the interactions and improve model performance. Experiments on two popular egocentric action recognition datasets, Something-Something-V2 and EPIC-KITCHENS-100, show that the proposed approach consistently outperforms recent state-of-the-art methods in typical semi-supervised settings. © 2024 Elsevier Ltd
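The core idea in the abstract, teacher edge features over hand–object pairs that a student is trained to mimic, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the simple geometric edge features, the function names, and the plain L2 mimic loss stand in for the paper's actual graph-neural-network design.

```python
# Illustrative sketch of interaction knowledge distillation (not the
# paper's implementation): a teacher derives hand-object "edge" features
# from detected boxes, and a student is penalized for deviating from them.

def edge_feature(hand_box, obj_box):
    """Relative-geometry feature for one hand-object edge.
    Boxes are (x1, y1, x2, y2); returns (dx, dy, size ratio)."""
    hx = (hand_box[0] + hand_box[2]) / 2.0
    hy = (hand_box[1] + hand_box[3]) / 2.0
    ox = (obj_box[0] + obj_box[2]) / 2.0
    oy = (obj_box[1] + obj_box[3]) / 2.0
    hand_w = hand_box[2] - hand_box[0]
    obj_w = obj_box[2] - obj_box[0]
    # Center offset plus width ratio as a crude interaction descriptor.
    return (ox - hx, oy - hy, (obj_w + 1e-6) / (hand_w + 1e-6))

def distillation_loss(teacher_edges, student_edges):
    """Mean squared error between teacher and student edge features."""
    total, count = 0.0, 0
    for t_edge, s_edge in zip(teacher_edges, student_edges):
        for t_val, s_val in zip(t_edge, s_edge):
            total += (t_val - s_val) ** 2
            count += 1
    return total / count

# Toy example: one detected hand interacting with two objects.
hand = (10, 10, 30, 30)
objects = [(40, 12, 60, 32), (15, 50, 35, 70)]
teacher = [edge_feature(hand, obj) for obj in objects]
student = [(25.0, 1.0, 0.9), (4.0, 41.0, 1.1)]  # hypothetical student predictions
loss = distillation_loss(teacher, student)
print(round(loss, 4))
```

In the actual method the student also processes unlabeled videos, so this mimic loss supplies a training signal where no action labels exist.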