MODEL SPIDER: Learning to Rank Pre-Trained Models Efficiently

被引:0
|
作者
Zhang, Yi-Kai [1 ]
Huang, Ting-Ji [1 ]
Ding, Yao-Xiang [2 ]
Zhan, De-Chuan [1 ]
Ye, Han-Jia [1 ]
机构
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou, Peoples R China
基金
国家重点研发计划;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Figuring out which Pre-Trained Model (PTM) from a model zoo fits the target task is essential to take advantage of plentiful model resources. With the availability of numerous heterogeneous PTMs from diverse fields, efficiently selecting the most suitable one is challenging due to the time-consuming costs of carrying out forward or backward passes over all PTMs. In this paper, we propose MODEL SPIDER, which tokenizes both PTMs and tasks by summarizing their characteristics into vectors to enable efficient PTM selection. By leveraging the approximated performance of PTMs on a separate set of training tasks, MODEL SPIDER learns to construct representation and measure the fitness score between a model-task pair via their representation. The ability to rank relevant PTMs higher than others generalizes to new tasks. With the top-ranked PTM candidates, we further learn to enrich task repr. with their PTM-specific semantics to re-rank the PTMs for better selection. MODEL SPIDER balances efficiency and selection ability, making PTM selection like a spider preying on a web. MODEL SPIDER exhibits promising performance across diverse model zoos, including visual models and Large Language Models (LLMs). Code is available at https://github.com/zhangyikaii/Model-Spider.
引用
收藏
页数:28
相关论文
共 50 条
  • [41] Quality of Pre-trained Deep-Learning Models for Palmprint Recognition
    Rosca, Valentin
    Ignat, Anca
    2020 22ND INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2020), 2020, : 202 - 209
  • [42] Mass detection in mammograms using pre-trained deep learning models
    Agarwal, Richa
    Diaz, Oliver
    Llado, Xavier
    Marti, Robert
    14TH INTERNATIONAL WORKSHOP ON BREAST IMAGING (IWBI 2018), 2018, 10718
  • [43] An Approach to Run Pre-Trained Deep Learning Models on Grayscale Images
    Ahmad, Ijaz
    Shin, Seokjoo
    3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (IEEE ICAIIC 2021), 2021, : 177 - 180
  • [44] From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Models to Pre-trained Machine Reader
    Xu, Weiwen
    Li, Xin
    Zhang, Wenxuan
    Zhou, Meng
    Lam, Wai
    Si, Luo
    Bing, Lidong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [45] Hyperbolic Pre-Trained Language Model
    Chen, Weize
    Han, Xu
    Lin, Yankai
    He, Kaichen
    Xie, Ruobing
    Zhou, Jie
    Liu, Zhiyuan
    Sun, Maosong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3101 - 3112
  • [46] Annotating Columns with Pre-trained Language Models
    Suhara, Yoshihiko
    Li, Jinfeng
    Li, Yuliang
    Zhang, Dan
    Demiralp, Cagatay
    Chen, Chen
    Tan, Wang-Chiew
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022, : 1493 - 1503
  • [47] Interpreting Art by Leveraging Pre-Trained Models
    Penzel, Niklas
    Denzler, Joachim
    2023 18TH INTERNATIONAL CONFERENCE ON MACHINE VISION AND APPLICATIONS, MVA, 2023,
  • [48] Lottery Jackpots Exist in Pre-Trained Models
    Zhang, Yuxin
    Lin, Mingbao
    Zhong, Yunshan
    Chao, Fei
    Ji, Rongrong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 14990 - 15004
  • [49] LaoPLM: Pre-trained Language Models for Lao
    Lin, Nankai
    Fu, Yingwen
    Yang, Ziyu
    Chen, Chuwei
    Jiang, Shengyi
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6506 - 6512
  • [50] Generalization of vision pre-trained models for histopathology
    Sikaroudi, Milad
    Hosseini, Maryam
    Gonzalez, Ricardo
    Rahnamayan, Shahryar
    Tizhoosh, H. R.
    SCIENTIFIC REPORTS, 2023, 13 (01)