Learning Implicit Temporal Alignment for Few-shot Video Classification

Cited by: 0

Authors:

Zhang, Songyang [1,2,4]

Zhou, Jiale [1]

He, Xuming [1,3]

Affiliations:

[1] ShanghaiTech Univ, Shanghai, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

[3] Shanghai Engn Res Ctr Intelligent Vision & Imagin, Shanghai, Peoples R China

[4] Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021 | 2021

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Few-shot video classification aims to learn new video categories with only a few labeled examples, alleviating the burden of costly annotation in real-world applications. However, it is particularly challenging to learn a class-invariant spatial-temporal representation in such a setting. To address this, we propose a novel matching-based few-shot learning strategy for video sequences. Our main idea is to introduce an implicit temporal alignment for a video pair, capable of estimating the similarity between the two videos in an accurate and robust manner. Moreover, we design an effective context encoding module to incorporate spatial and feature channel context, resulting in better modeling of intra-class variations. To train our model, we develop a multi-task loss for learning video matching, leading to video features with better generalization. Extensive experimental results on two challenging benchmarks show that our method outperforms the prior art by a sizable margin on Something-Something-V2 and achieves competitive results on Kinetics.


Pages: 1309-1315

Page count: 7

50 items in total

[1] Few-shot action recognition with implicit temporal alignment and pair similarity optimization
Cao, Congqi
Li, Yajuan
Lv, Qinyi
Wang, Peng
Zhang, Yanning
COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 210
[2] Few-Shot In-Context Imitation Learning via Implicit Graph Alignment
Vosylius, Vitalis
Johns, Edward
CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
[3] Learning feature alignment and dual correlation for few-shot image classification
Huang, Xilang
Choi, Seon Han
CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2024, 9 (02) : 303 - 318
[4] Few-Shot Classification with Contrastive Learning
Yang, Zhanyuan
Wang, Jinghua
Zhu, Yingying
COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 293 - 309
[5] Few-Shot Ensemble Learning for Video Classification with SlowFast Memory Networks
Qi, Mengshi
Qin, Jie
Zhen, Xiantong
Huang, Di
Yang, Yi
Luo, Jiebo
MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 3007 - 3015
[6] Few-Shot Video Classification via Representation Fusion and Promotion Learning
Xia, Haifeng
Li, Kai
Min, Martin Renqiang
Ding, Zhengming
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 19254 - 19263
[7] Elastic temporal alignment for few-shot action recognition
Pan, Fei
Xu, Chunlei
Zhang, Hongjie
Guo, Jie
Guo, Yanwen
IET COMPUTER VISION, 2023, 17 (01) : 39 - 50
[8] Text-guided Graph Temporal Modeling for few-shot video classification
Deng, Fuqin
Zhong, Jiaming
Li, Nannan
Fu, Lanhui
Jiang, Bingchun
Yi, Ningbo
Qi, Feng
Xin, He
Lam, Tin Lun
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 137
[9] Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments
Nguyen, Khoi D.
Quoc-Huy Tran
Khoi Nguyen
Binh-Son Hua
Rang Nguyen
COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 471 - 487
[10] Category Alignment Mechanism for Few-Shot Image Classification
Zhou, Zhenyu
Luo, Lei
Liu, Tianrui
Liao, Qing
Liu, Xinwang
Zhu, En
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 14
