HyRSM++: Hybrid relation guided temporal set matching for few-shot action recognition

Cited by: 10
Authors
Wang, Xiang [1]
Zhang, Shiwei [2]
Qing, Zhiwu [1]
Zuo, Zhengrong [1]
Gao, Changxin [1]
Jin, Rong [3]
Sang, Nong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automation, Key Lab, Minist Educ Image Proc & Intelligent Control, Wuhan, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Meta AI, Medford, MA USA
Funding
National Natural Science Foundation of China
Keywords
Few-shot action recognition; Set matching; Semi-supervised few-shot action recognition; Unsupervised few-shot action recognition
DOI
10.1016/j.patcog.2023.110110
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Few-shot action recognition is a challenging but practical problem that aims to learn a model that can be easily adapted to identify new action categories from only a few labeled samples. However, existing approaches still suffer from two drawbacks: (i) learning individual features without considering the entire task may limit representation capability, and (ii) existing alignment strategies are sensitive to noise and misaligned instances. To address these two limitations, we propose a novel Hybrid Relation guided temporal Set Matching (HyRSM++) approach for few-shot action recognition. The core idea of HyRSM++ is to integrate all videos within the task to learn discriminative representations and to employ a robust matching technique. Specifically, HyRSM++ consists of two key components: a hybrid relation module and a temporal set matching metric. Given the basic representations from the feature extractor, the hybrid relation module fully exploits associated relations within and across videos in an episodic task and can thus learn task-specific embeddings. Subsequently, in the temporal set matching metric, we measure the distance between query and support videos from a set matching perspective and design a bidirectional Mean Hausdorff Metric to improve resilience to misaligned instances. Furthermore, we extend the proposed HyRSM++ to the more challenging semi-supervised and unsupervised few-shot action recognition tasks. Experimental results on multiple benchmarks demonstrate that our method consistently outperforms existing methods and achieves state-of-the-art performance under various few-shot settings. The source code is available at https://github.com/alibaba-mmai-research/HyRSMPlusPlus.
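The set matching idea in the abstract can be illustrated with a small sketch. It treats each video as an unordered set of frame feature vectors and scores a query against a support video with a bidirectional mean Hausdorff-style distance: for each frame, take the distance to its closest frame in the other video, average, and sum both directions. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the use of Euclidean distance, and the nearest-class rule in `classify` are all hypothetical choices made here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed frame distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def directed_mean_min(src, dst, dist=euclidean):
    """Average, over frames in src, of the distance to the closest frame in dst."""
    return sum(min(dist(f, g) for g in dst) for f in src) / len(src)


def bidirectional_mean_hausdorff(query, support, dist=euclidean):
    """Sum of the two directed mean-min distances (query->support, support->query).

    Because each frame only looks for its nearest counterpart, the score does
    not depend on frame order, which is what makes it tolerant of misalignment.
    """
    return directed_mean_min(query, support, dist) + directed_mean_min(support, query, dist)


def classify(query, support_by_class):
    """Hypothetical nearest-class rule: pick the class with the smallest distance."""
    return min(
        support_by_class,
        key=lambda c: bidirectional_mean_hausdorff(query, support_by_class[c]),
    )


# Toy usage: the "a" support video holds the same frames as the query, reordered.
query = [[0.0, 0.0], [1.0, 0.0]]
support = {"a": [[1.0, 0.0], [0.0, 0.0]], "b": [[5.0, 5.0], [6.0, 5.0]]}
predicted = classify(query, support)
```

In this toy case the reordered video is at distance zero from the query, so a strictly frame-by-frame alignment would penalize it while the set-based score does not.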
Pages: 13
Related papers
50 in total
  • [1] Hybrid Relation Guided Set Matching for Few-shot Action Recognition
    Wang, Xiang
    Zhang, Shiwei
    Qing, Zhiwu
    Tang, Mingqian
    Zuo, Zhengrong
    Gao, Changxin
    Jin, Rong
    Sang, Nong
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 19916 - 19925
  • [2] Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching
    Xing, Jiazheng
    Wang, Mengmeng
    Ruan, Yudi
    Chen, Bofan
    Guo, Yaowei
    Mu, Boyu
    Dai, Guang
    Wang, Jingdong
    Liu, Yong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 1740 - 1750
  • [3] Spatio-temporal Relation Modeling for Few-shot Action Recognition
    Thatipelli, Anirudh
    Narayan, Sanath
    Khan, Salman
    Anwer, Rao Muhammad
    Khan, Fahad Shahbaz
    Ghanem, Bernard
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 19926 - 19935
  • [4] Compound Prototype Matching for Few-Shot Action Recognition
    Huang, Yifei
    Yang, Lijin
    Sato, Yoichi
    COMPUTER VISION - ECCV 2022, PT IV, 2022, 13664 : 351 - 368
  • [5] Matching Compound Prototypes for Few-Shot Action Recognition
    Huang, Yifei
    Yang, Lijin
    Chen, Guo
    Zhang, Hongjie
    Lu, Feng
    Sato, Yoichi
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (09) : 3977 - 4002
  • [6] Semantic-Guided Relation Propagation Network for Few-shot Action Recognition
    Wang, Xiao
    Ye, Weirong
    Qi, Zhongang
    Zhao, Xun
    Wang, Guangge
    Shan, Ying
    Wang, Hanzi
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 816 - 825
  • [7] Elastic temporal alignment for few-shot action recognition
    Pan, Fei
    Xu, Chunlei
    Zhang, Hongjie
    Guo, Jie
    Guo, Yanwen
    IET COMPUTER VISION, 2023, 17 (01) : 39 - 50
  • [8] Hierarchical Task-aware Temporal Modeling and Matching for few-shot action recognition
    Zhan, Yucheng
    Pan, Yijun
    Wu, Siying
    Zhang, Yueyi
    Sun, Xiaoyan
    NEUROCOMPUTING, 2025, 624
  • [9] Semantic-guided spatio-temporal attention for few-shot action recognition
    Wang, Jianyu
    Liu, Baolin
    APPLIED INTELLIGENCE, 2024, 54 : 2458 - 2471
  • [10] Semantic-guided spatio-temporal attention for few-shot action recognition
    Wang, Jianyu
    Liu, Baolin
    APPLIED INTELLIGENCE, 2024, 54 (03) : 2458 - 2471