Video Attribute Prototype Network: A New Perspective for Zero-Shot Video Classification

Cited by: 0
Authors
Wang, Bo [1]
Zhao, Kaili [1]
Zhao, Hongyang [1]
Pu, Shi
Xiao, Bo [1]
Guo, Jun [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
DOI
10.1109/ICCVW60793.2023.00039
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video attributes, which leverage video contents to instantiate class semantics, play a critical role in diversifying semantics in zero-shot video classification, thereby facilitating semantic transfer from seen to unseen classes. However, few existing works discuss video attributes, and most methods treat class names as class semantics, which tend to be loosely defined. In this paper, we propose a Video Attribute Prototype Network (VAPNet) that generates video attributes by learning in-context semantics between video captions and class semantics. Specifically, we introduce a cross-attention module in the Transformer decoder that uses video captions as queries to probe and pool semantically associated class-wise features. To alleviate noise in the pre-extracted captions, we learn caption features through a stochastic representation derived from a Gaussian distribution whose variance encodes uncertainty. We use a joint video-to-attribute and video-to-video contrastive loss to calibrate visual and semantic features. Experiments show that VAPNet significantly outperforms the state of the art (SoTA), with relative improvements of 14.3% on UCF101 and 8.8% on HMDB51, and further surpasses the pre-trained vision-language SoTA by 4.1% and 17.2%. Code is available.
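The two mechanisms named in the abstract, a Gaussian caption representation and cross-attention with captions as queries, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, tensor sizes, random inputs, and the use of single-head scaled dot-product attention with the reparameterization trick are all assumptions made for illustration only.

```python
import numpy as np


def sample_caption_feature(mu, log_var, rng):
    """Draw z = mu + sigma * eps from N(mu, diag(sigma^2)) (reparameterization).

    The variance acts as a per-dimension uncertainty: a noisy caption would be
    given a larger variance (assumed reading of the abstract).
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    Each query row pools the value rows using softmax-normalized similarity
    to the key rows. Returns the pooled features and the attention weights.
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights


# Hypothetical sizes and random stand-ins for learned features.
rng = np.random.default_rng(0)
num_captions, num_classes, dim = 4, 3, 8
caption_mu = rng.standard_normal((num_captions, dim))
caption_log_var = np.full((num_captions, dim), -2.0)
class_features = rng.standard_normal((num_classes, dim))

# Captions (as stochastic features) query the class-wise features.
caption_z = sample_caption_feature(caption_mu, caption_log_var, rng)
attributes, weights = cross_attention(caption_z, class_features, class_features)
print(attributes.shape)  # (4, 8)
```

Each row of `weights` sums to 1, so every output row is a convex combination of the class-wise features selected by one caption query.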
Pages: 315-324
Number of pages: 10
Related papers
50 in total
  • [1] Zero-shot Micro-video Classification with Neural Variational Inference in Graph Prototype Network
    Chen, Junyang
    Wang, Jialong
    Dai, Zhijiang
    Wu, Huisi
    Wang, Mengzhu
    Zhang, Qin
    Wang, Huan
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 966 - 974
  • [2] Learning to Model Relationships for Zero-Shot Video Classification
    Gao, Junyu
    Zhang, Tianzhu
    Xu, Changsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (10) : 3476 - 3491
  • [3] Zero-shot Node Classification with Decomposed Graph Prototype Network
    Wang, Zheng
    Wang, Jialong
    Guo, Yuchen
    Gong, Zhiguo
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 1769 - 1779
  • [4] ATTRIBUTE DRIVEN ZERO-SHOT CLASSIFICATION AND SEGMENTATION
    Yang, Shu
    Shi, Yemin
    Wang, Yaowei
    Wang, Jing
    Fei, Zesong
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018), 2018,
  • [5] Attribute relation learning for zero-shot classification
    Liu, Mingxia
    Zhang, Daoqiang
    Chen, Songcan
    NEUROCOMPUTING, 2014, 139 : 34 - 46
  • [6] Zero-Shot Image Classification Based on Attribute
    Zhang, Wei
    Chen, Wenbai
    Chen, Xiangfeng
    Han, Hu
    2017 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2017, : 25 - 30
  • [7] Zero-shot classification with unseen prototype learning
    Ji, Zhong
    Cui, Biying
    Yu, Yunlong
    Pang, Yanwei
    Zhang, Zhongfei
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (17) : 12307 - 12317
  • [8] Zero-shot classification with unseen prototype learning
    Zhong Ji
    Biying Cui
    Yunlong Yu
    Yanwei Pang
    Zhongfei Zhang
    Neural Computing and Applications, 2023, 35 : 12307 - 12317
  • [9] Visual Data Synthesis via GAN for Zero-Shot Video Classification
    Zhang, Chenrui
    Peng, Yuxin
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 1128 - 1134
  • [10] Zero-Shot Video Classification Combined with 3D DenseNet
    Yin M.
    Zhao X.
    Guo S.
    Chen Z.
    Zhang J.
    Wuhan Daxue Xuebao (Xinxi Kexue Ban)/Geomatics and Information Science of Wuhan University, 2023, 48 (03) : 480 - 488