Video Attribute Prototype Network: A New Perspective for Zero-Shot Video Classification

Cited by: 0

Authors:

Wang, Bo [1]

Zhao, Kaili [1]

Zhao, Hongyang [1]

Pu, Shi

Xiao, Bo [1]

Guo, Jun [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW | 2023

DOI:

10.1109/ICCVW60793.2023.00039

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video attributes, which leverage video contents to instantiate class semantics, play a critical role in diversifying semantics in zero-shot video classification, thereby facilitating semantic transfer from seen to unseen classes. However, few works discuss video attributes, and most methods treat class names as class semantics, which tend to be loosely defined. In this paper, we propose a Video Attribute Prototype Network (VAPNet) that generates video attributes by learning in-context semantics between video captions and class semantics. Specifically, we introduce a cross-attention module in the Transformer decoder that treats video captions as queries to probe and pool semantically associated class-wise features. To alleviate noise in pre-extracted captions, we learn caption features through a stochastic representation derived from a Gaussian representation, where the variance encodes uncertainty. We use a joint video-to-attribute and video-to-video contrastive loss to calibrate visual and semantic features. Experiments show that VAPNet significantly outperforms SoTA by relative improvements of 14.3% on UCF101 and 8.8% on HMDB51, and further surpasses the pre-trained vision-language SoTA by 4.1% and 17.2%. Code is available.
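The three mechanisms named in the abstract (a Gaussian caption representation, caption-as-query cross-attention, and a contrastive loss) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the abstract gives no architectural details, so every function name, dimension, and value below is a hypothetical choice. The sketch assumes reparameterized sampling for the stochastic representation, single-query scaled dot-product attention, and an InfoNCE-style loss over cosine similarities.

```python
import math
import random


def sample_gaussian(mean, log_var, rng):
    """Reparameterized sample: mean + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(dim)]
    return pooled, weights


def normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def contrastive_loss(anchor, candidates, positive_index, temperature=0.1):
    """InfoNCE-style loss: negative log-probability of the positive."""
    a = normalize(anchor)
    logits = [dot(a, normalize(c)) / temperature for c in candidates]
    return -math.log(softmax(logits)[positive_index])


# Toy data (hypothetical): one caption and three class-wise feature vectors.
rng = random.Random(0)
caption_mean = [1.0, 0.0, 0.0, 0.0]
caption_log_var = [-6.0] * 4  # small variance = low caption uncertainty
class_features = [[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]]

# The sampled caption acts as the query; class features are keys and values.
caption = sample_gaussian(caption_mean, caption_log_var, rng)
attribute, weights = cross_attention(caption, class_features, class_features)

# A video feature close to class 0 incurs a lower loss when class 0 is the positive.
video_feature = [0.9, 0.1, 0.0, 0.0]
loss_pos = contrastive_loss(video_feature, class_features, 0)
loss_neg = contrastive_loss(video_feature, class_features, 1)
```

In this toy setup, a larger `caption_log_var` injects more noise into the query, which is one way a variance term can express uncertainty about a pre-extracted caption.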

Pages: 315-324

Page count: 10
