Video Attribute Prototype Network: A New Perspective for Zero-Shot Video Classification

Cited by: 0
Authors
Wang, Bo [1]
Zhao, Kaili [1]
Zhao, Hongyang [1]
Pu, Shi
Xiao, Bo [1]
Guo, Jun [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
DOI
10.1109/ICCVW60793.2023.00039
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video attributes, which use video content to instantiate class semantics, play a critical role in diversifying semantics in zero-shot video classification and thereby facilitate semantic transfer from seen to unseen classes. However, few existing works discuss video attributes, and most methods treat class names as class semantics, which tend to be loosely defined. In this paper, we propose a Video Attribute Prototype Network (VAPNet) that generates video attributes by learning in-context semantics between video captions and class semantics. Specifically, we introduce a cross-attention module in the Transformer decoder that uses video captions as queries to probe and pool semantically associated class-wise features. To alleviate noise in pre-extracted captions, we learn caption features through a stochastic representation derived from a Gaussian representation whose variance encodes uncertainty. We use a joint video-to-attribute and video-to-video contrastive loss to calibrate visual and semantic features. Experiments show that VAPNet significantly outperforms the state of the art (SoTA) by relative improvements of 14.3% on UCF101 and 8.8% on HMDB51, and further surpasses the pre-trained vision-language SoTA by 4.1% and 17.2%. Code is available.
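The three mechanisms named in the abstract (stochastic caption features, caption-as-query cross-attention, and a contrastive objective) can be illustrated with a minimal pure-Python sketch. This is not the authors' released code: every function name, the log-variance parameterization, the single-head scaled dot-product attention, the InfoNCE form of the loss, and the temperature value are assumptions chosen for illustration only.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(v):
    n = math.sqrt(dot(v, v)) or 1.0
    return [x / n for x in v]


def sample_caption_feature(mean, log_var, rng):
    """Reparameterized draw z = mean + sigma * eps with eps ~ N(0, 1).

    Assumption: the variance is stored as a log-variance per dimension.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention (hypothetical simplification).

    Each caption query attends over class-wise features and pools them.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    pooled = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        pooled.append([sum(w * v[d] for w, v in zip(weights, values))
                       for d in range(dim)])
    return pooled


def contrastive_loss(anchors, targets, temperature=0.1):
    """InfoNCE-style loss: anchors[i] should match targets[i].

    Assumption: a generic form that could be applied to either the
    video-to-attribute or the video-to-video pairing.
    """
    loss = 0.0
    for i, a in enumerate(anchors):
        a = normalize(a)
        logits = [dot(a, normalize(t)) / temperature for t in targets]
        loss -= math.log(softmax(logits)[i])
    return loss / len(anchors)


# Toy usage: two captions, three classes, three feature dimensions.
rng = random.Random(0)
caption_mean = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
caption_log_var = [[-2.0, -2.0, -2.0], [-1.0, -1.0, -1.0]]
class_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

queries = [sample_caption_feature(m, lv, rng)
           for m, lv in zip(caption_mean, caption_log_var)]
attributes = cross_attention(queries, class_feats, class_feats)
```

In this toy setup a larger log-variance makes a caption's sampled query noisier, which is one simple way a model could express uncertainty about a pre-extracted caption; how VAPNet actually parameterizes and trains this is not stated in the abstract.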
Pages: 315-324
Page count: 10