Few-shot Action Recognition via Multi-view Representation Learning

Authors
Wang X. [1]
Lu Y. [1]
Yu W. [1]
Pang Y. [2]
Wang H. [1]
Affiliations
[1] School of Informatics, Fujian Key Laboratory of Sensing and Computing for Smart City, Xiamen University, Xiamen
[2] School of Electrical and Information Engineering, Tianjin Key Laboratory of Brain-Inspired Intelligence Technology, Tianjin University, Tianjin
Funding
National Natural Science Foundation of China
Keywords
action recognition; Circuits and systems; Convolution; Few-shot learning; meta-learning; multi-view representation learning; Prototypes; Representation learning; Task analysis; Three-dimensional displays; Training
DOI
10.1109/TCSVT.2024.3384875
Abstract
Few-shot action recognition aims to recognize novel action classes from only a few labeled samples and has recently received increasing attention. Its core objective is to enhance the discriminability of feature representations. In this paper, we propose a novel multi-view representation learning network (MRLN) that models intra-video and inter-video relations for few-shot action recognition. First, we propose a spatial-aware aggregation refinement module (SARM), which consists mainly of a spatial-aware aggregation sub-module and a spatial-aware refinement sub-module, to explore the spatial context of samples at the frame level. Second, we design a temporal-channel enhancement module (TCEM), which captures temporal-aware and channel-aware features of samples through a temporal-aware enhancement sub-module and a channel-aware enhancement sub-module. Third, we introduce a cross-video relation module (CVRM), which explores relations across videos using a self-attention mechanism. Moreover, we design a prototype-centered mean absolute error loss to improve the feature learning capability of MRLN. Extensive experiments on four widely used few-shot action recognition benchmarks show that MRLN significantly outperforms a variety of state-of-the-art few-shot action recognition methods. In particular, in the 5-way 1-shot setting, MRLN achieves 75.7%, 86.9%, 65.5%, and 45.9% on the Kinetics, UCF101, HMDB51, and SSv2 datasets, respectively.
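To make the 5-way 1-shot setting and the idea of a prototype-centered mean absolute error loss concrete, the following is a minimal sketch and not the MRLN implementation. The abstract does not define the loss or the feature extractor, so everything here is an assumption for illustration: features are plain toy vectors, a prototype is taken to be the mean of a class's support features, and the loss is assumed to be the average absolute error between each query feature and its own class prototype. All function names are hypothetical.

```python
import random


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def mae(a, b):
    """Mean absolute error between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def build_prototypes(support):
    """support maps a class label to its K support feature vectors (K shots)."""
    return {label: mean_vector(feats) for label, feats in support.items()}


def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest under MAE."""
    return min(prototypes, key=lambda label: mae(query, prototypes[label]))


def prototype_mae_loss(queries, labels, prototypes):
    """Assumed loss form: average MAE between each query and its class prototype."""
    total = sum(mae(q, prototypes[y]) for q, y in zip(queries, labels))
    return total / len(queries)


# Toy 5-way 1-shot episode: 5 classes, 1 labeled support sample per class.
random.seed(0)
N_WAY, K_SHOT, DIM = 5, 1, 8
# Hypothetical class centers, spaced far apart relative to the noise.
centers = {c: [10.0 * c] * DIM for c in range(N_WAY)}


def sample(c):
    """Draw a toy feature vector near the center of class c."""
    return [v + random.uniform(-1.0, 1.0) for v in centers[c]]


support = {c: [sample(c) for _ in range(K_SHOT)] for c in range(N_WAY)}
queries = [sample(c) for c in range(N_WAY)]
labels = list(range(N_WAY))

prototypes = build_prototypes(support)
predictions = [classify(q, prototypes) for q in queries]
loss = prototype_mae_loss(queries, labels, prototypes)
print(predictions, loss)
```

In a real system the toy vectors would be replaced by video features from a trained backbone, and the loss would be minimized by gradient descent alongside a classification objective; how MRLN combines these terms is not stated in the abstract.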
Pages: 1 / 1