Sliding Cross Entropy for Self-Knowledge Distillation

Cited by: 3
Authors
Lee, Hanbeen [1 ]
Kim, Jeongho [1 ]
Woo, Simon S. [2 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Artificial Intelligence, Suwon, South Korea
[2] Sungkyunkwan Univ, Coll Comp & Informat, Suwon, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Representation Learning; Knowledge Distillation; Computer Vision;
DOI
10.1145/3511808.3557453
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Knowledge distillation (KD) is a powerful technique for improving the performance of a small model by leveraging the knowledge of a larger model. Despite its remarkable performance boost, KD has the drawback of the substantial computational cost of pre-training a larger teacher model in advance. Recently, self-knowledge distillation has emerged to improve a model's performance without supervision from a separately trained teacher. In this paper, we present a novel plug-in approach, the Sliding Cross Entropy (SCE) method, which can be combined with existing self-knowledge distillation methods to significantly improve performance. Specifically, to minimize the difference between the model's output and the soft target obtained by self-distillation, we split each softmax representation into slices of a fixed window size and reduce the distance between corresponding slices. Through this approach, the model evenly considers all inter-class relationships of the soft target during optimization. Extensive experiments show that our approach is effective across classification, object detection, and semantic segmentation tasks, and that SCE consistently outperforms existing baseline methods.
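As a rough illustration of the idea described in the abstract, the following PyTorch-style sketch computes a sliding cross entropy between a student's softmax output and a self-distilled soft target. The function name sliding_cross_entropy and the parameters window_size, stride, and temperature are illustrative assumptions, as are the class ordering, fixed stride, and per-slice re-normalization; they are not confirmed details of the paper.

```python
import torch
import torch.nn.functional as F


def sliding_cross_entropy(student_logits, teacher_logits,
                          window_size=10, stride=5, temperature=4.0):
    """Sketch of a sliding cross entropy between a student output and a
    self-distilled soft target, following the abstract's description.

    Assumptions (not confirmed by the paper): classes are ordered by the
    soft target's confidence, windows slide with a fixed stride, and each
    slice is re-normalized before its cross-entropy term is computed.
    """
    # Soft target from the self-teacher and the student's distribution.
    p_t = F.softmax(teacher_logits / temperature, dim=1)
    p_s = F.softmax(student_logits / temperature, dim=1)

    # Order classes by soft-target confidence so that each window groups
    # classes of comparable relevance (an assumption for illustration).
    order = torch.argsort(p_t, dim=1, descending=True)
    p_t = torch.gather(p_t, 1, order)
    p_s = torch.gather(p_s, 1, order)

    num_classes = p_t.size(1)
    loss, num_windows = 0.0, 0
    for start in range(0, num_classes - window_size + 1, stride):
        t_win = p_t[:, start:start + window_size]
        s_win = p_s[:, start:start + window_size]
        # Re-normalize each slice into a valid distribution, then take the
        # cross entropy between the target slice and the student slice.
        t_win = t_win / t_win.sum(dim=1, keepdim=True).clamp_min(1e-12)
        s_win = s_win / s_win.sum(dim=1, keepdim=True).clamp_min(1e-12)
        loss = loss - (t_win * torch.log(s_win + 1e-12)).sum(dim=1).mean()
        num_windows += 1
    return loss / max(num_windows, 1)
```

In a full training pipeline, such a term would typically be added to the standard cross-entropy loss on ground-truth labels and to the chosen self-distillation objective, weighted by a hyperparameter.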
Pages: 1044-1053
Page count: 10