A Survey of Knowledge Distillation in Deep Learning

Cited by: 0
Authors
Shao R.-R. [1 ]
Liu Y.-A. [1 ]
Zhang W. [1 ]
Wang J. [1 ]
Institution
[1] School of Computer Science and Technology, East China Normal University, Shanghai
Source
Chinese Journal of Computers (计算机学报)
Keywords
Artificial intelligence; Deep neural network; Knowledge distillation; Model compression; Transfer learning
DOI
10.11897/SP.J.1016.2022.01638
Abstract
With the rapid development of artificial intelligence, deep neural networks are widely used across research fields and have achieved great success, yet they still face many challenges. First, to solve increasingly complex problems and improve training performance, network architectures have grown deep and complicated, making them ill-suited to the low-resource, low-power requirements of mobile computing. Knowledge distillation was originally proposed for model compression: a learning paradigm that transfers knowledge from a large teacher model to a compact student model while preserving, or even improving, performance. As the field has developed, however, the teacher-student architecture, as a special form of transfer learning, has evolved a rich variety of variants and has gradually been extended to many deep learning tasks and scenarios, including computer vision, natural language processing, and recommendation systems. Moreover, by transferring knowledge between neural network models, knowledge distillation can connect cross-modal or cross-domain learning tasks and mitigate knowledge forgetting; it can also separate models from data, thereby protecting private data. Knowledge distillation plays an increasingly important role across the fields of artificial intelligence and is a general means of solving many practical problems. This paper organizes the main literature on knowledge distillation, elaborates its learning framework, compares and analyzes related work from multiple classification perspectives, introduces the main application scenarios, and finally discusses future development trends. © 2022, Science Press. All rights reserved.
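
As a minimal illustration of the teacher-student paradigm described in the abstract, the following PyTorch sketch implements the classic soft-label distillation loss (Hinton et al.), in which the student matches the teacher's temperature-softened output distribution in addition to the ground-truth labels. It is not taken from the surveyed paper; the temperature T, mixing weight alpha, and the toy linear models are illustrative assumptions.

# Minimal knowledge-distillation loss sketch (PyTorch), illustrating the
# teacher-student paradigm from the abstract. T and alpha are illustrative
# hyperparameters, not values from the surveyed paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.7) -> torch.Tensor:
    """Combine the soft-label KD term with the standard hard-label loss."""
    # Soft targets: KL divergence between temperature-softened distributions.
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage: the teacher is kept out of the graph; only the student gets gradients.
teacher = torch.nn.Linear(32, 10).eval()   # stand-in for a large teacher
student = torch.nn.Linear(32, 10)          # compact student model
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
with torch.no_grad():
    t_logits = teacher(x)
loss = distillation_loss(student(x), t_logits, y)
loss.backward()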
Pages: 1638-1673
Page count: 35