MULTI-SCALE TEMPORAL FEATURE FUSION FOR FEW-SHOT ACTION RECOGNITION

Cited by: 0
Authors
Lee, Jun-Tae [1]
Yun, Sungrack [1]
Affiliations
[1] Qualcomm AI Res, Initiat Qualcomm Technol Inc, Qualcomm Korea YH, Seoul, South Korea
Keywords
Few-shot learning; few-shot action; video representation; temporal fusion; cross-attention
DOI
10.1109/ICIP49359.2023.10223132
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper aims to recognize, in test (query) videos, actions of interest that are specified by only a few support videos. Our approach centers on a novel temporal enrichment module in which features describing local temporal contexts in a video are enhanced by collaboratively merging important information from frame-level features, which carry no temporal context. We call this the multi-scale temporal feature fusion (MSTFF) module. By using multiple MSTFF modules that vary the scope of local temporal context extraction, we obtain a discriminative video representation, which is crucial in few-shot tasks where the support videos are not sufficient to describe an action class. To stabilize the training of a model with MSTFF and to boost performance, we also train an auxiliary classifier at the local temporal context level in parallel with the main classifier. We analyze the proposed components to demonstrate their importance, and we achieve state-of-the-art results on three few-shot action recognition benchmarks: Something-Something V2 (SSv2), HMDB51, and Kinetics.
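The fusion idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how local temporal contexts are extracted or how the merge is computed. The sketch assumes sliding-window averaging for the local contexts, a single unlearned dot-product cross-attention step (suggested only by the "cross-attention" keyword), and a residual sum. All function names and the window sizes are hypothetical.

```python
import math

def local_context(frames, window):
    """Assumed extractor: average each run of `window` consecutive frame features."""
    dim = len(frames[0])
    out = []
    for start in range(len(frames) - window + 1):
        chunk = frames[start:start + window]
        out.append([sum(f[d] for f in chunk) / window for d in range(dim)])
    return out

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention_fuse(contexts, frames):
    """Local-context features act as queries; frame-level features as keys and values."""
    dim = len(frames[0])
    scale = math.sqrt(dim)
    fused = []
    for q in contexts:
        scores = [sum(a * b for a, b in zip(q, f)) / scale for f in frames]
        weights = softmax(scores)
        attended = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
        # Residual sum (an assumption): keep the context feature, add frame-level detail.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused

def multi_scale_fusion(frames, windows=(2, 3)):
    """One fusion pass per window size (temporal scope); returns {window: fused features}."""
    return {w: cross_attention_fuse(local_context(frames, w), frames) for w in windows}

# Four frames with 2-dimensional features (made-up values for illustration).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
fused = multi_scale_fusion(frames)
```

Each window size yields its own sequence of enriched context features. A real model would presumably use learned projections and a trained backbone, which this sketch omits.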
Pages: 1785-1789
Number of pages: 5