Label Propagation for Zero-shot Classification with Vision-Language Models

Authors
Stojnic, Vladan [1]
Kalantidis, Yannis [2]
Tolias, Giorgos [1]
Affiliations
[1] Czech Tech Univ, FEE, VRG, Prague, Czech Republic
[2] NAVER LABS Europe, Meylan, France
DOI
10.1109/CVPR52733.2024.02190
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Vision-Language Models (VLMs) have demonstrated impressive performance on zero-shot classification, i.e. classification when provided merely with a list of class names. In this paper, we tackle the case of zero-shot classification in the presence of unlabeled data. We leverage the graph structure of the unlabeled data and introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification. We tailor LP to graphs containing both text and image features and further propose an efficient method for performing inductive inference based on a dual solution and a sparsification step. We perform extensive experiments to evaluate the effectiveness of our method on 14 common datasets and show that ZLaP outperforms the latest related works. Code: https://github.com/vladan-stojnic/ZLaP
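For readers unfamiliar with label propagation, the following is a minimal sketch of the generic closed-form variant on a k-nearest-neighbour graph that holds both text (class) nodes and image nodes. It is not the authors' ZLaP implementation: the abstract states that ZLaP additionally tailors the graph to the two modalities and uses a dual solution with sparsification for inductive inference, none of which is reproduced here. The function name `label_propagation` and the parameters `k` and `alpha` are illustrative assumptions.

```python
import numpy as np


def label_propagation(image_feats, text_feats, k=3, alpha=0.85):
    """Generic closed-form label propagation (illustrative, not ZLaP).

    image_feats: (n, d) L2-normalised features of unlabeled images.
    text_feats:  (c, d) L2-normalised class-name features, one per class.
    Returns the predicted class index for each image.
    """
    # Text nodes come first and act as the only labeled nodes.
    nodes = np.vstack([text_feats, image_feats])
    n_cls, n = text_feats.shape[0], nodes.shape[0]

    # Cosine similarities; exclude self-matches from the neighbour search.
    sim = nodes @ nodes.T
    np.fill_diagonal(sim, -np.inf)

    # Sparse kNN adjacency with non-negative weights, then symmetrise.
    nn_idx = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nn_idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.maximum(sim[rows, cols], 0.0)
    W = np.maximum(W, W.T)

    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot labels on the text nodes, zeros on the image nodes.
    Y = np.zeros((n, n_cls))
    Y[np.arange(n_cls), np.arange(n_cls)] = 1.0

    # Closed-form solution of Z = alpha * S @ Z + Y.
    Z = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return Z[n_cls:].argmax(axis=1)


if __name__ == "__main__":
    # Toy 2-D data: three images near each of two class directions.
    angles = np.deg2rad([5, 10, 15, 75, 80, 85])
    images = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    texts = np.array([[1.0, 0.0], [0.0, 1.0]])
    print(label_propagation(images, texts))
```

The dense solve costs O(n^3) and is only practical for small graphs; the efficiency techniques named in the abstract address exactly this limitation.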
Pages: 23209-23218
Page count: 10