Few-Shot Learning of Compact Models via Task-Specific Meta Distillation

Cited by: 2
Authors
Wu, Yong [1]
Chanda, Shekhor [2]
Hosseinzadeh, Mehrdad [3]
Liu, Zhi [1]
Wang, Yang [4]
Affiliations
[1] Shanghai Univ, Shanghai, Peoples R China
[2] Univ Manitoba, Winnipeg, MB, Canada
[3] Huawei Technol Canada, Markham, ON, Canada
[4] Concordia Univ, Montreal, PQ, Canada
Funding
National Natural Science Foundation of China; Natural Sciences and Engineering Research Council of Canada;
DOI
10.1109/WACV56688.2023.00620
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider a new problem of few-shot learning of compact models. Meta-learning is a popular approach for few-shot learning. Previous work in meta-learning typically assumes that the model architecture during meta-training is the same as the model architecture used for final deployment. In this paper, we challenge this basic assumption. For final deployment, we often need the model to be small. But small models usually do not have enough capacity to effectively adapt to new tasks. Meanwhile, we often have access to a large dataset and extensive computing power during meta-training, since meta-training is typically performed on a server. In this paper, we propose task-specific meta distillation, which simultaneously learns two models in meta-learning: a large teacher model and a small student model. These two models are jointly learned during meta-training. Given a new task during meta-testing, the teacher model is first adapted to this task; the adapted teacher model is then used to guide the adaptation of the student model. The adapted student model is used for final deployment. We demonstrate the effectiveness of our approach in few-shot image classification using model-agnostic meta-learning (MAML). Our proposed method outperforms other alternatives on several benchmark datasets.
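To make the adapt-teacher-then-distill pipeline described in the abstract concrete, below is a minimal sketch of one meta-training step in PyTorch. It assumes synthetic episodic tasks, a two-layer "large" teacher, a linear "small" student, and a single inner-loop gradient step; all names here (`sample_task`, `adapt`, `kd_loss`, the learning rates and sizes) are illustrative placeholders, not the authors' code or hyperparameters.

```python
# Sketch of task-specific meta distillation on top of a MAML-style inner loop.
# Synthetic data and tiny models; illustrative only, not the paper's implementation.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, hidden, n_cls, inner_lr = 32, 64, 5, 0.4

def sample_task(n_support=25, n_query=25):
    # stand-in for an episodic few-shot sampler (support set + query set)
    w = torch.randn(dim, n_cls)
    x = torch.randn(n_support + n_query, dim)
    y = (x @ w).argmax(-1)
    return x[:n_support], y[:n_support], x[n_support:], y[n_support:]

def t_forward(params, x):                       # teacher: 2-layer MLP ("large")
    w1, b1, w2, b2 = params
    return F.relu(x @ w1 + b1) @ w2 + b2

def s_forward(params, x):                       # student: linear model ("small")
    w, b = params
    return x @ w + b

def adapt(params, loss):
    # one MAML-style inner gradient step; create_graph keeps the
    # second-order path so the meta-update can differentiate through it
    grads = torch.autograd.grad(loss, params, create_graph=True)
    return [p - inner_lr * g for p, g in zip(params, grads)]

def kd_loss(s_logits, t_logits, T=2.0):
    # soft-label distillation: KL(teacher || student) at temperature T
    t = F.softmax(t_logits / T, -1)
    return (t * (F.log_softmax(t_logits / T, -1)
                 - F.log_softmax(s_logits / T, -1))).sum(-1).mean()

teacher = [(0.1 * torch.randn(dim, hidden)).requires_grad_(),
           torch.zeros(hidden, requires_grad=True),
           (0.1 * torch.randn(hidden, n_cls)).requires_grad_(),
           torch.zeros(n_cls, requires_grad=True)]
student = [(0.1 * torch.randn(dim, n_cls)).requires_grad_(),
           torch.zeros(n_cls, requires_grad=True)]
opt = torch.optim.Adam(teacher + student, lr=1e-2)

for step in range(200):
    sx, sy, qx, qy = sample_task()
    # 1) adapt the large teacher to the task on the support set
    t_task = adapt(teacher, F.cross_entropy(t_forward(teacher, sx), sy))
    # 2) adapt the small student, guided by the adapted teacher's soft labels
    s_inner = (F.cross_entropy(s_forward(student, sx), sy)
               + kd_loss(s_forward(student, sx), t_forward(t_task, sx)))
    s_task = adapt(student, s_inner)
    # 3) meta-objective: adapted models should do well on the query set;
    #    backprop jointly updates the initializations of teacher and student
    outer = (F.cross_entropy(s_forward(s_task, qx), qy)
             + F.cross_entropy(t_forward(t_task, qx), qy))
    opt.zero_grad()
    outer.backward()
    opt.step()
```

At deployment, only the student initialization is kept: given a new task, one would first adapt the teacher on the support set, run the student inner step under its guidance as in step 2, and ship the adapted student. Loss weighting, the number of inner steps, and the actual architectures in the paper differ from this sketch.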
Pages: 6254 - 6263
Page count: 10
Related Papers
50 items in total
  • [21] Few-shot SAR target classification via meta-learning with hybrid models
    Geng, Qingtian
    Wang, Yaning
    Li, Qingliang
    FRONTIERS IN EARTH SCIENCE, 2024, 12
  • [22] Learning Meta Soft Prompt for Few-Shot Language Models
    Chien, Jen-Tzung
    Chen, Ming-Yen
    Xue, Jing-Hao
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023: 57 - 62
  • [23] Few-shot driver identification via meta-learning
    Lu, Lin
    Xiong, Shengwu
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 203
  • [24] Few-Shot Acoustic Event Detection via Meta Learning
    Shi, Bowen
    Sun, Ming
    Puvvada, Krishna C.
    Kao, Chieh-Chi
    Matsoukas, Spyros
    Wang, Chao
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 76 - 80
  • [26] Few-Shot High-Resolution Range Profile Ship Target Recognition Based on Task-Specific Meta-Learning with Mixed Training and Meta Embedding
    Kong, Yingying
    Zhang, Yuxuan
    Peng, Xiangyang
    Leung, Henry
    REMOTE SENSING, 2023, 15 (22)
  • [27] Few-shot class incremental learning via prompt transfer and knowledge distillation
    Akmel, Feidu
    Meng, Fanman
    Liu, Mingyu
    Zhang, Runtong
    Teka, Asebe
    Lemuye, Elias
    IMAGE AND VISION COMPUTING, 2024, 151
  • [28] Few-Shot Class-Incremental Learning via Relation Knowledge Distillation
    Dong, Songlin
    Hong, Xiaopeng
    Tao, Xiaoyu
    Chang, Xinyuan
    Wei, Xing
    Gong, Yihong
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35: 1255 - 1263
  • [29] Few-Shot Image Classification via Mutual Distillation
    Zhang, Tianshu
    Dai, Wenwen
    Chen, Zhiyu
    Yang, Sai
    Liu, Fan
    Zheng, Hao
    APPLIED SCIENCES-BASEL, 2023, 13 (24)
  • [30] Hierarchical Knowledge Propagation and Distillation for Few-Shot Learning
    Zhou, Chunpeng
    Wang, Haishuai
    Zhou, Sheng
    Yu, Zhi
    Bandara, Danushka
    Bu, Jiajun
    NEURAL NETWORKS, 2023, 167: 615 - 625