Label Propagation for Zero-shot Classification with Vision-Language Models

Cited by: 0
Authors
Stojnic, Vladan [1 ]
Kalantidis, Yannis [2 ]
Tolias, Giorgos [1 ]
Affiliations
[1] Czech Tech Univ, FEE, VRG, Prague, Czech Republic
[2] NAVER LABS Europe, Meylan, France
Keywords
DOI
10.1109/CVPR52733.2024.02190
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Vision-Language Models (VLMs) have demonstrated impressive performance on zero-shot classification, i.e. classification when provided merely with a list of class names. In this paper, we tackle the case of zero-shot classification in the presence of unlabeled data. We leverage the graph structure of the unlabeled data and introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification. We tailor LP to graphs containing both text and image features and further propose an efficient method for performing inductive inference based on a dual solution and a sparsification step. We perform extensive experiments to evaluate the effectiveness of our method on 14 common datasets and show that ZLaP outperforms the latest related works. Code: https://github.com/vladan-stojnic/ZLaP
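
To make the abstract's pipeline concrete, below is a minimal sketch of classic transductive label propagation (Zhou et al., 2004) over a k-NN graph built from CLIP-style image and text embeddings, where the text (class-prompt) nodes carry the labels and predictions are read off the image nodes. This is an illustrative approximation, not the paper's exact ZLaP method: it omits the geodesic-distance formulation and the dual/sparsified inductive inference, uses brute-force neighbor search, and all function names plus the values of k and alpha are assumptions.

```python
# Minimal, illustrative label-propagation sketch (NOT the exact ZLaP algorithm).
# Assumes L2-normalizable image features (n_img x d) and one text embedding per
# class (n_cls x d), e.g. from a CLIP-style encoder.
import numpy as np
from scipy.sparse import csr_matrix, identity, diags
from scipy.sparse.linalg import cg


def knn_affinity(features, k=10):
    """Build a symmetric k-NN affinity matrix from L2-normalized features."""
    sim = features @ features.T                     # cosine similarities (dense, brute force)
    np.fill_diagonal(sim, -np.inf)                  # exclude self-loops
    idx = np.argpartition(-sim, k, axis=1)[:, :k]   # top-k neighbors per node
    n = features.shape[0]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    vals = np.clip(sim[rows, cols], 0, None)        # keep non-negative edge weights
    W = csr_matrix((vals, (rows, cols)), shape=(n, n))
    return W.maximum(W.T)                           # symmetrize


def label_propagation(image_feats, text_feats, alpha=0.9, k=10):
    """Propagate class labels from text nodes to unlabeled image nodes."""
    feats = np.vstack([image_feats, text_feats]).astype(np.float64)
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
    W = knn_affinity(feats, k)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}
    d = np.asarray(W.sum(axis=1)).ravel()
    d_inv_sqrt = diags(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = d_inv_sqrt @ W @ d_inv_sqrt

    # Initial labels: one-hot rows for text (class) nodes, zeros for image nodes.
    n_img, n_cls = image_feats.shape[0], text_feats.shape[0]
    Y = np.zeros((n_img + n_cls, n_cls))
    Y[n_img:, :] = np.eye(n_cls)

    # Solve (I - alpha * S) Z = Y with conjugate gradients, one class at a time.
    A = identity(n_img + n_cls, format="csr") - alpha * S
    Z = np.column_stack([cg(A, Y[:, c])[0] for c in range(n_cls)])
    return Z[:n_img].argmax(axis=1)                 # predicted class per image
```

For the authors' actual implementation, including the inductive variant based on the dual solution and sparsification, see the repository linked in the abstract.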
Pages: 23209 - 23218
Page count: 10
Related Papers
50 items in total
  • [1] Vision-Language Models for Zero-Shot Classification of Remote Sensing Images
    Al Rahhal, Mohamad Mahmoud
    Bazi, Yakoub
    Elgibreen, Hebah
    Zuair, Mansour
    APPLIED SCIENCES-BASEL, 2023, 13 (22):
  • [2] Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
    Zheng, Zangwei
    Ma, Mingyuan
    Wang, Kai
    Qin, Ziheng
    Yue, Xiangyu
    You, Yang
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 19068 - 19079
  • [3] MLTU: mixup long-tail unsupervised zero-shot image classification on vision-language models
    Jia, Yunpeng
    Ye, Xiufen
    Mei, Xinkui
    Liu, Yusong
    Guo, Shuxiang
    MULTIMEDIA SYSTEMS, 2024, 30 (03)
  • [4] Inference Calibration of Vision-Language Foundation Models for Zero-Shot and Few-Shot Learning
    Hu, Minyang
    Chang, Hong
    Shan, Shiguang
    Chen, Xilin
    PATTERN RECOGNITION LETTERS, 2025, 192 : 15 - 21
  • [5] Multiple Prompt Fusion for Zero-Shot Lesion Detection Using Vision-Language Models
    Guo, Miaotian
    Yi, Huahui
    Qin, Ziyuan
    Wang, Haiying
    Men, Aidong
    Lao, Qicheng
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT V, 2023, 14224 : 283 - 292
  • [6] Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
    Shu, Manli
    Nie, Weili
    Huang, De-An
    Yu, Zhiding
    Goldstein, Tom
    Anandkumar, Anima
    Xiao, Chaowei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [7] VLPSR: Enhancing Zero-Shot Object ReID with Vision-Language Model
    Hu, Mingzhe
    ADVANCES IN VISUAL COMPUTING, ISVC 2024, PT II, 2025, 15047 : 56 - 69
  • [8] CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment
    Javed, Sajid
    Mahmood, Arif
    Ganapathi, Iyyakutti Iyappan
    Dharejo, Fayaz Ali
    Werghi, Naoufel
    Bennamoun, Mohammed
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 11450 - 11459
  • [9] VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation
    Yokoyama, Naoki
    Ha, Sehoon
    Batra, Dhruv
    Wang, Jiuguang
    Bucher, Bernadette
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 42 - 48
  • [10] Zero-Shot Object Counting With Vision-Language Prior Guidance Network
    Zhai, Wenzhe
    Xing, Xianglei
    Gao, Mingliang
    Li, Qilei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) : 2487 - 2498