MODEL SPIDER: Learning to Rank Pre-Trained Models Efficiently

Times Cited: 0
Authors
Zhang, Yi-Kai [1 ]
Huang, Ting-Ji [1 ]
Ding, Yao-Xiang [2 ]
Zhan, De-Chuan [1 ]
Ye, Han-Jia [1 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Figuring out which Pre-Trained Model (PTM) from a model zoo best fits the target task is essential for exploiting plentiful model resources. With numerous heterogeneous PTMs from diverse fields available, efficiently selecting the most suitable one is challenging because running forward or backward passes over all PTMs is prohibitively time-consuming. In this paper, we propose MODEL SPIDER, which tokenizes both PTMs and tasks by summarizing their characteristics into vectors, enabling efficient PTM selection. By leveraging the approximated performance of PTMs on a separate set of training tasks, MODEL SPIDER learns to construct these representations and to measure the fitness score of a model-task pair via their representations. The ability to rank relevant PTMs higher than others generalizes to new tasks. With the top-ranked PTM candidates, we further learn to enrich the task representation with their PTM-specific semantics and re-rank the PTMs for better selection. MODEL SPIDER balances efficiency and selection ability, making PTM selection like a spider preying on a web. MODEL SPIDER exhibits promising performance across diverse model zoos, including visual models and Large Language Models (LLMs). Code is available at https://github.com/zhangyikaii/Model-Spider.
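The abstract outlines the core mechanism: each PTM in the zoo and each task are summarized into token vectors, a learned head scores the fitness of a model-task pair, and the scorer is trained so that PTMs with higher approximated transfer performance on training tasks rank higher. Below is a minimal sketch of that ranking recipe; the class and function names, the dot-product scoring head, and the pairwise hinge ranking loss are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch of learning-to-rank PTM selection, following the abstract.
# All names, the dot-product scoring head, and the pairwise hinge loss are
# hypothetical choices for illustration, not MODEL SPIDER's actual design.
import torch
import torch.nn as nn


class PTMRanker(nn.Module):
    """Scores every PTM in a model zoo against a task token."""

    def __init__(self, num_ptms: int, dim: int):
        super().__init__()
        self.ptm_tokens = nn.Embedding(num_ptms, dim)  # one learnable token per PTM
        self.proj = nn.Linear(dim, dim)                # projects the task token

    def forward(self, task_token: torch.Tensor) -> torch.Tensor:
        # task_token: (batch, dim) -> fitness scores: (batch, num_ptms)
        return self.proj(task_token) @ self.ptm_tokens.weight.t()


def pairwise_ranking_loss(scores: torch.Tensor, perf: torch.Tensor,
                          margin: float = 1.0) -> torch.Tensor:
    """Hinge loss pushing PTMs with higher approximated transfer
    performance (`perf`, same shape as `scores`) to score higher."""
    diff_s = scores.unsqueeze(2) - scores.unsqueeze(1)  # score_i - score_j
    diff_p = perf.unsqueeze(2) - perf.unsqueeze(1)      # perf_i  - perf_j
    should_outrank = (diff_p > 0).float()               # pairs where i beats j
    loss = torch.relu(margin - diff_s) * should_outrank
    return loss.sum() / should_outrank.sum().clamp(min=1.0)


# Toy usage: 4 training tasks, a zoo of 42 PTMs, 128-d tokens.
ranker = PTMRanker(num_ptms=42, dim=128)
task_tokens = torch.randn(4, 128)   # from some task encoder (assumed)
approx_perf = torch.rand(4, 42)     # approximated transfer performance
loss = pairwise_ranking_loss(ranker(task_tokens), approx_perf)
loss.backward()
```

At selection time one would sort the scores per task and keep the top-ranked PTMs; the abstract's re-ranking step, which enriches the task representation with PTM-specific semantics of those top candidates, is omitted from this sketch.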
Pages: 28
Related Papers
50 records in total
  • [1] Efficiently Robustify Pre-Trained Models
    Jain, Nishant
    Behl, Harkirat
    Rawat, Yogesh Singh
    Vineet, Vibhav
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 5482 - 5492
  • [2] Learning to Modulate pre-trained Models in RL
    Schmied, Thomas
    Hofmarcher, Markus
    Paischer, Fabian
    Pascanu, Razvan
    Hochreiter, Sepp
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [3] Continual Learning with Pre-Trained Models: A Survey
    Zhou, Da-Wei
    Sun, Hai-Long
    Ning, Jingyi
    Ye, Han-Jia
    Zhan, De-Chuan
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 8363 - 8371
  • [4] Efficiently Gluing Pre-Trained Language and Vision Models for Image Captioning
    Song, Peipei
    Zhou, Yuanen
    Liu, Daqing
    Yang, Xun
    Wang, Depeng
    Hu, Zhenzhen
    Wang, Meng
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2024, 15 (06)
  • [5] Efficiently Adapting Traffic Pre-trained Models for Encrypted Traffic Classification
    Lu, Wenxuan
    Lv, Zhuohang
    Yang, Lanqi
    Luo, Xiang
    Zang, Tianning
PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 2828 - 2833
  • [6] Towards Inadequately Pre-trained Models in Transfer Learning
    Deng, Andong
    Li, Xingjian
    Hu, Di
    Wang, Tianyang
    Xiong, Haoyi
    Xu, Cheng-Zhong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 19340 - 19351
  • [7] Transfer learning with pre-trained conditional generative models
    Yamaguchi, Shin'ya
    Kanai, Sekitoshi
    Kumagai, Atsutoshi
    Chijiwa, Daiki
    Kashima, Hisashi
    MACHINE LEARNING, 2025, 114 (04)
  • [8] Sparse Low-rank Adaptation of Pre-trained Language Models
    Ding, Ning
    Lv, Xingtai
    Wang, Qiaosen
    Chen, Yulin
    Zhou, Bowen
    Liu, Zhiyuan
    Sun, Maosong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 4133 - 4145
  • [9] PTMA: Pre-trained Model Adaptation for Transfer Learning
    Li, Xiao
    Yan, Junkai
    Jiang, Jianjian
    Zheng, Wei-Shi
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, KSEM 2024, 2024, 14884 : 176 - 188
  • [10] Federated Learning from Pre-Trained Models: A Contrastive Learning Approach
    Tan, Yue
    Long, Guodong
    Ma, Jie
    Liu, Lu
    Zhou, Tianyi
    Jiang, Jing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,