Robust fine-tuning of zero-shot models

Cited by: 86
Authors
Wortsman, Mitchell [1 ]
Ilharco, Gabriel [1 ]
Kim, Jong Wook [2 ]
Li, Mike [3 ]
Kornblith, Simon [4 ]
Roelofs, Rebecca [4 ]
Lopes, Raphael Gontijo [4 ]
Hajishirzi, Hannaneh [1 ]
Farhadi, Ali [1 ]
Namkoong, Hongseok [3 ]
Schmidt, Ludwig [1 ]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] OpenAI, San Francisco, CA USA
[3] Columbia Univ, New York, NY USA
[4] Google Res, Brain Team, Toronto, ON, Canada
Keywords
DOI
10.1109/CVPR52688.2022.00780
Chinese Library Classification (CLC) code
TP18 [Theory of artificial intelligence];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve accuracy on a given target distribution, they often reduce robustness to distribution shifts. We address this tension by introducing a simple and effective method for improving robustness while fine-tuning: ensembling the weights of the zero-shot and fine-tuned models (WiSE-FT). Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution. On ImageNet and five derived distribution shifts, WiSE-FT improves accuracy under distribution shift by 4 to 6 percentage points (pp) over prior work while increasing ImageNet accuracy by 1.6 pp. WiSE-FT achieves similarly large robustness gains (2 to 23 pp) on a diverse set of six further distribution shifts, and accuracy gains of 0.8 to 3.3 pp compared to standard fine-tuning on commonly used transfer learning datasets. These improvements come at no additional computational cost during fine-tuning or inference.
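The method summarized in the abstract, WiSE-FT, is a weight-space ensemble: the parameters of the zero-shot model and the fine-tuned model are linearly interpolated. The sketch below illustrates that interpolation under the assumption that both models share the same architecture and are available as PyTorch state dicts; the helper name `wise_ft_interpolate` and its interface are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of weight-space ensembling (WiSE-FT-style interpolation).
# Assumes two PyTorch state dicts from models with identical architectures; the
# function name `wise_ft_interpolate` is hypothetical, not the paper's official API.
import torch


def wise_ft_interpolate(zero_shot_state, fine_tuned_state, alpha=0.5):
    """Return (1 - alpha) * zero-shot weights + alpha * fine-tuned weights."""
    assert zero_shot_state.keys() == fine_tuned_state.keys(), "architectures must match"
    merged = {}
    for key, zs_param in zero_shot_state.items():
        ft_param = fine_tuned_state[key]
        if torch.is_floating_point(zs_param):
            merged[key] = (1.0 - alpha) * zs_param + alpha * ft_param
        else:
            # Integer buffers (e.g. batch-norm counters) are copied, not interpolated.
            merged[key] = ft_param
    return merged


# Usage (hypothetical objects): load the merged weights back into a model instance.
# merged_state = wise_ft_interpolate(zero_shot_model.state_dict(),
#                                    fine_tuned_model.state_dict(), alpha=0.5)
# model.load_state_dict(merged_state)
```

Setting alpha = 0 recovers the zero-shot model and alpha = 1 recovers the standard fine-tuned model; intermediate values trade off target-distribution accuracy against robustness, at no extra training or inference cost since the result is a single set of weights.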
Pages: 7949-7961
Number of pages: 13
Related papers
50 records in total
  • [1] Feature fine-tuning and attribute representation transformation for zero-shot learning
    Pang, Shanmin
    He, Xin
    Hao, Wenyu
    Long, Yang
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 236
  • [2] Fine-tuning Encoders for Improved Monolingual and Zero-shot Polylingual Neural Topic Modeling
    Mueller, Aaron
    Dredze, Mark
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 3054 - 3068
  • [3] Towards Zero-Shot Conditional Summarization with Adaptive Multi-Task Fine-Tuning
    Goodwin, Travis R.
    Savery, Max E.
    Demner-Fushman, Dina
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020,
  • [4] CODE: Contrastive Pre-training with Adversarial Fine-Tuning for Zero-Shot Expert Linking
    Chen, Bo
    Zhang, Jing
    Zhang, Xiaokang
    Tang, Xiaobin
    Cai, Lingfan
    Chen, Hong
    Li, Cuiping
    Zhang, Peng
    Tang, Jie
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11846 - 11854
  • [5] An Empirical Evaluation of the Zero-Shot, Few-Shot, and Traditional Fine-Tuning Based Pretrained Language Models for Sentiment Analysis in Software Engineering
    Shafikuzzaman, Md
    Islam, Md Rakibul
    Rolli, Alex C.
    Akhter, Sharmin
    Seliya, Naeem
    [J]. IEEE ACCESS, 2024, 12 : 109714 - 109734
  • [6] Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization
    Zhao, Lulu
    Zheng, Fujia
    Zeng, Weihao
    He, Keqing
    Xu, Weiran
    Jiang, Huixing
    Wu, Wei
    Wu, Yanan
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 4848 - 4862
  • [7] Center-VAE with discriminative and semantic-relevant fine-tuning features for generalized zero-shot learning
    Zhai, Zhibo
    Li, Xiao
    Chang, Zhonghao
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 111
  • [8] Zero-shot language extension for dialogue state tracking via pre-trained models and multi-auxiliary-tasks fine-tuning
    Xiang, Lu
    Zhao, Yang
    Zhu, Junnan
    Zhou, Yu
    Zong, Chengqing
    [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 259
  • [9] Robust Test-Time Adaptation for Zero-Shot Prompt Tuning
    Zhang, Ding-Chu
    Zhou, Zhi
    Li, Yu-Feng
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 15, 2024, : 16714 - 16722
  • [10] How robust are discriminatively trained zero-shot learning models?
    Yucel, Mehmet Kerim
    Cinbis, Ramazan Gokberk
    Duygulu, Pinar
    [J]. IMAGE AND VISION COMPUTING, 2022, 119