Learning Implicit Temporal Alignment for Few-shot Video Classification

Authors
Zhang, Songyang [1, 2, 4]
Zhou, Jiale [1]
He, Xuming [1, 3]
Affiliations
[1] ShanghaiTech Univ, Shanghai, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Shanghai Engn Res Ctr Intelligent Vision & Imagin, Shanghai, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, Beijing, Peoples R China
Source
PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021 | 2021
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Few-shot video classification aims to learn new video categories from only a few labeled examples, alleviating the burden of costly annotation in real-world applications. However, it is particularly challenging to learn a class-invariant spatial-temporal representation in such a setting. To address this, we propose a novel matching-based few-shot learning strategy for video sequences. Our main idea is to introduce an implicit temporal alignment for a video pair, which allows the similarity between the two videos to be estimated accurately and robustly. Moreover, we design an effective context encoding module to incorporate spatial and feature-channel context, resulting in better modeling of intra-class variations. To train our model, we develop a multi-task loss for learning video matching, leading to video features with better generalization. Extensive experimental results on two challenging benchmarks show that our method outperforms prior art by a sizable margin on Something-Something-V2 and achieves competitive results on Kinetics.
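The abstract does not say how the implicit temporal alignment is computed, so the following is only a minimal sketch of the general idea of matching-based few-shot video classification, not the authors' method. It assumes each video is already encoded as a list of per-frame feature vectors, and it swaps in a simple soft attention over frames as a stand-in for the alignment step. All names (`aligned_similarity`, `classify`, the `temperature` parameter) and the toy features are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def softmax(values, temperature):
    """Numerically stable softmax with a temperature (assumed hyperparameter)."""
    peak = max(values)
    exps = [math.exp((x - peak) / temperature) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def aligned_similarity(query, support, temperature=0.1):
    """Score a video pair without an explicit frame-to-frame alignment.

    Each query frame attends softly over all support frames, so videos of
    different lengths or tempos can still be compared. This attention is an
    illustrative stand-in, not the mechanism described in the paper.
    """
    frame_scores = []
    for q in query:
        sims = [cosine(q, s) for s in support]
        weights = softmax(sims, temperature)
        frame_scores.append(sum(w * s for w, s in zip(weights, sims)))
    return sum(frame_scores) / len(frame_scores)


def classify(query, support_set, temperature=0.1):
    """Return the label whose best-matching support video scores highest.

    support_set maps each class label to a list of videos (few shots).
    """
    best_label, best_score = None, -math.inf
    for label, videos in support_set.items():
        score = max(aligned_similarity(query, v, temperature) for v in videos)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy 1-shot, 2-way episode with 2-D frame features (made-up values).
support_set = {
    "push": [[[1.0, 0.0], [0.0, 1.0]]],
    "pull": [[[-1.0, 0.0], [0.0, -1.0]]],
}
# The query has a different length and tempo than the support videos.
query = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
print(classify(query, support_set))
```

Because every query frame is compared against all support frames, the score does not depend on the two videos having the same number of frames, which is the property the toy query is meant to illustrate.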
Pages: 1309-1315
Page count: 7