HyRSM++: Hybrid relation guided temporal set matching for few-shot action recognition

Cited by: 10
Authors
Wang, Xiang [1]
Zhang, Shiwei [2]
Qing, Zhiwu [1]
Zuo, Zhengrong [1]
Gao, Changxin [1]
Jin, Rong [3]
Sang, Nong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automation, Key Lab, Minist Educ Image Proc & Intelligent Control, Wuhan, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Meta AI, Medford, MA USA
Funding
National Natural Science Foundation of China
Keywords
Few-shot action recognition; Set matching; Semi-supervised few-shot action recognition; Unsupervised few-shot action recognition;
DOI
10.1016/j.patcog.2023.110110
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Few-shot action recognition is a challenging but practical problem that aims to learn a model that can be easily adapted to identify new action categories from only a few labeled samples. However, existing approaches still suffer from two drawbacks: (i) learning individual features without considering the entire task may limit representation capability, and (ii) existing alignment strategies are sensitive to noise and misaligned instances. To address these two limitations, we propose a novel Hybrid Relation guided temporal Set Matching (HyRSM++) approach for few-shot action recognition. The core idea of HyRSM++ is to integrate all videos within the task to learn discriminative representations and to employ a robust matching technique. Specifically, HyRSM++ consists of two key components: a hybrid relation module and a temporal set matching metric. Given the basic representations from the feature extractor, the hybrid relation module fully exploits the relations within and across videos in an episodic task and can thus learn task-specific embeddings. Subsequently, in the temporal set matching metric, we measure the distance between query and support videos from a set matching perspective and design a bidirectional Mean Hausdorff Metric to improve resilience to misaligned instances. Furthermore, we extend HyRSM++ to the more challenging semi-supervised and unsupervised few-shot action recognition tasks. Experimental results on multiple benchmarks demonstrate that our method consistently outperforms existing methods and achieves state-of-the-art performance under various few-shot settings. The source code is available at https://github.com/alibaba-mmai-research/HyRSMPlusPlus.
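The abstract names a bidirectional Mean Hausdorff Metric but does not give its formula. As a rough illustration only, the sketch below assumes one common form of such a set distance: for each frame in one video, take the distance to its nearest frame in the other video, average these minima, and sum the two directions. The function names, the use of Euclidean distance, and the toy 2-D "frame features" are all assumptions made for this example, not the authors' implementation.

```python
import math


def directed_mean_min(src, dst):
    """Average, over frames in src, of the distance to the closest frame in dst.

    Hypothetical helper: Euclidean distance is an assumption of this sketch.
    """
    return sum(min(math.dist(a, b) for b in dst) for a in src) / len(src)


def bidirectional_mean_hausdorff(query, support):
    """Sum of the two directed mean-of-minimum distances (assumed form)."""
    return directed_mean_min(query, support) + directed_mean_min(support, query)


# Toy frame features: each video is a set of 2-D vectors (illustrative only).
query = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
shifted = [(1.0, 0.0), (2.0, 0.0), (0.0, 0.0)]  # same frames, different order
other = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]    # every frame is 3 units away

d_same = bidirectional_mean_hausdorff(query, shifted)
d_other = bidirectional_mean_hausdorff(query, other)
print(d_same, d_other)
```

Because each frame is matched to its nearest counterpart regardless of position in the sequence, the reordered video is at distance zero from the query, which is the kind of tolerance to temporal misalignment that a set matching view is meant to provide.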
Pages: 13