MODEL SPIDER: Learning to Rank Pre-Trained Models Efficiently

Cited: 0
Authors
Zhang, Yi-Kai [1 ]
Huang, Ting-Ji [1 ]
Ding, Yao-Xiang [2 ]
Zhan, De-Chuan [1 ]
Ye, Han-Jia [1 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou, Peoples R China
Funding
National Key R&D Program of China;
Keywords
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104; 0812; 0835; 1405;
Abstract
Figuring out which Pre-Trained Model (PTM) from a model zoo fits the target task is essential to take advantage of plentiful model resources. With numerous heterogeneous PTMs available from diverse fields, efficiently selecting the most suitable one is challenging because carrying out forward or backward passes over all PTMs is time-consuming. In this paper, we propose MODEL SPIDER, which tokenizes both PTMs and tasks by summarizing their characteristics into vectors to enable efficient PTM selection. By leveraging the approximated performance of PTMs on a separate set of training tasks, MODEL SPIDER learns to construct representations and measure the fitness score of a model-task pair via those representations. The ability to rank relevant PTMs higher than others generalizes to new tasks. With the top-ranked PTM candidates, we further learn to enrich the task representation with their PTM-specific semantics to re-rank the PTMs for better selection. MODEL SPIDER balances efficiency and selection ability, making PTM selection resemble a spider preying on a web. MODEL SPIDER exhibits promising performance across diverse model zoos, including visual models and Large Language Models (LLMs). Code is available at https://github.com/zhangyikaii/Model-Spider.
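As a rough illustration of the workflow the abstract describes (summarize PTMs and tasks into token vectors, score model-task fitness with a learned scorer, and supervise the scorer with approximated transfer performance on training tasks), the following minimal PyTorch sketch may help. It is not the authors' implementation (see the linked repository for that); the scorer architecture, dimensions, and the pairwise margin loss below are illustrative assumptions.

import torch
import torch.nn as nn

class FitnessScorer(nn.Module):
    # Scores how well each PTM "token" matches a task "token" (illustrative design).
    def __init__(self, dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, model_tokens: torch.Tensor, task_token: torch.Tensor) -> torch.Tensor:
        # model_tokens: (num_ptms, dim); task_token: (dim,) -> scores: (num_ptms,)
        task = task_token.unsqueeze(0).expand(model_tokens.size(0), -1)
        return self.mlp(torch.cat([model_tokens, task], dim=-1)).squeeze(-1)

def pairwise_ranking_loss(scores, perf, margin=0.1):
    # Push score_i above score_j whenever PTM i transfers better than PTM j on the task;
    # perf holds the approximated transfer performance used as supervision.
    diff = scores.unsqueeze(1) - scores.unsqueeze(0)
    better = (perf.unsqueeze(1) > perf.unsqueeze(0)).float()
    return (better * torch.clamp(margin - diff, min=0)).mean()

# Toy usage: 5 PTM tokens, one task token, and approximated performances as labels.
torch.manual_seed(0)
scorer = FitnessScorer(dim=64)
model_tokens = torch.randn(5, 64)   # one summary vector per PTM in the zoo
task_token = torch.randn(64)        # summary vector of the target task
approx_perf = torch.rand(5)         # proxy transfer performance on a training task

scores = scorer(model_tokens, task_token)
loss = pairwise_ranking_loss(scores, approx_perf)
loss.backward()                                  # gradient for one training step
top_ranked = scores.argsort(descending=True)     # candidates for the re-ranking stage
print(top_ranked.tolist())

Under this reading, a new task only requires computing its token and evaluating the lightweight scorer, rather than forward or backward passes over every PTM, which is the efficiency benefit the abstract emphasizes.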
Pages: 28
Related Papers
50 records in total
  • [21] Refining Pre-Trained Motion Models
    Sun, Xinglong
    Harley, Adam W.
    Guibas, Leonidas J.
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 4932 - 4938
  • [22] How to Estimate Model Transferability of Pre-Trained Speech Models?
    Chen, Zih-Ching
    Yang, Chao-Han Huck
    Li, Bo
    Zhang, Yu
    Chen, Nanxin
    Chang, Shou-Yiin
    Prabhavalkar, Rohit
    Lee, Hung-yi
    Sainath, Tara N.
    INTERSPEECH 2023, 2023, : 456 - 460
  • [23] TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models
    Yang, Ziqing
    Cui, Yiming
    Chen, Zhigang
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): PROCEEDINGS OF SYSTEM DEMONSTRATIONS, 2022, : 35 - 43
  • [24] Pre-trained Models for Sonar Images
    Valdenegro-Toro, Matias
    Preciado-Grijalva, Alan
    Wehbe, Bilal
    OCEANS 2021: SAN DIEGO - PORTO, 2021,
  • [25] Pre-Trained Language Models and Their Applications
    Wang, Haifeng
    Li, Jiwei
    Wu, Hua
    Hovy, Eduard
    Sun, Yu
    ENGINEERING, 2023, 25 : 51 - 65
  • [26] Backdoor Attacks Against Transfer Learning With Pre-Trained Deep Learning Models
    Wang, Shuo
    Nepal, Surya
    Rudolph, Carsten
    Grobler, Marthie
    Chen, Shangyu
    Chen, Tianle
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (03) : 1526 - 1539
  • [27] Learning and Evaluating a Differentially Private Pre-trained Language Model
    Hoory, Shlomo
    Feder, Amir
    Tendler, Avichai
    Cohen, Alon
    Erell, Sofia
    Laish, Itay
    Nakhost, Hootan
    Stemmer, Uri
    Benjamini, Ayelet
    Hassidim, Avinatan
    Matias, Yossi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1178 - 1189
  • [28] TransTailor: Pruning the Pre-trained Model for Improved Transfer Learning
    Liu, Bingyan
    Cai, Yifeng
    Guo, Yao
    Chen, Xiangqun
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8627 - 8634
  • [29] Pre-trained combustion model and transfer learning in thermoacoustic instability
    Qin, Ziyu
    Wang, Xinyao
    Han, Xiao
    Lin, Yuzhen
    Zhou, Yuchen
    PHYSICS OF FLUIDS, 2023, 35 (03)
  • [30] Classification of Regional Food Using Pre-Trained Transfer Learning Models
    Gadhiya, Jeet
    Khatik, Anjali
    Kodinariya, Shruti
    Ramoliya, Dipak
7TH INTERNATIONAL CONFERENCE ON ELECTRONICS, COMMUNICATION AND AEROSPACE TECHNOLOGY, ICECA 2023 - PROCEEDINGS, 2023, : 1237 - 1241