How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?

Cited by: 4
Authors
Ming, Yifei [1 ]
Li, Yixuan [1 ]
Affiliations
[1] Univ Wisconsin Madison, Dept Comp Sci, Madison, WI 53715 USA
Funding
U.S. National Science Foundation
Keywords
CLIP; OOD detection; Fine-tuning; Multi-modality; Vision-language models; Prompt learning; Few-shot learning; Adaptor
DOI
10.1007/s11263-023-01895-7
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recent large vision-language models such as CLIP have shown remarkable out-of-distribution (OOD) detection and generalization performance. However, their zero-shot in-distribution (ID) accuracy is often limited on downstream datasets. Recent CLIP-based fine-tuning methods such as prompt learning have demonstrated significant improvements in ID classification and OOD generalization when OOD labels are available. Nonetheless, it remains unclear whether the model is reliable under semantic shifts without OOD labels. In this paper, we aim to bridge this gap and present a comprehensive study of how fine-tuning impacts OOD detection for few-shot downstream tasks. By framing OOD detection as multi-modal concept matching, we establish a connection between fine-tuning methods and various OOD scores. Our results suggest that a proper choice of OOD score is essential for CLIP-based fine-tuning. In particular, the maximum concept matching (MCM) score consistently provides a promising solution. We also show that prompt learning achieves state-of-the-art OOD detection performance, surpassing the zero-shot counterpart.
Pages: 596-609 (14 pages)
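
The abstract frames OOD detection as matching an image against textual class concepts, with the maximum concept matching (MCM) score as the detector. Below is a minimal NumPy sketch of such a softmax-over-cosine-similarity score, assuming image_feat and text_feats come from a CLIP-style image/text encoder pair; the function name and default temperature are illustrative, not taken from the paper.

```python
import numpy as np

def mcm_score(image_feat, text_feats, tau=1.0):
    """Sketch of a maximum concept matching (MCM) style score.

    image_feat: (d,) image embedding (e.g., from a CLIP image encoder).
    text_feats: (C, d) embeddings of C class-concept prompts
                (e.g., "a photo of a <class>").
    tau: softmax temperature (value here is illustrative).

    Returns the maximum softmax probability over cosine similarities;
    lower scores suggest the input is out-of-distribution.
    """
    # L2-normalize so dot products equal cosine similarities.
    img = image_feat / np.linalg.norm(image_feat)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)

    sims = txt @ img                       # (C,) cosine similarities
    logits = sims / tau
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return probs.max()
```

A typical use would flag an input as OOD when mcm_score(...) falls below a threshold chosen on held-out ID data; the same scoring function applies unchanged to zero-shot CLIP and to its prompt-learned or adaptor-based fine-tuned variants, which is what makes the score a common yardstick across fine-tuning methods.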