Knowledge Distillation by On-the-Fly Native Ensemble

Citations: 0
Authors:
Lan, Xu [1 ]
Zhu, Xiatian [2 ]
Gong, Shaogang [1 ]
Affiliations:
[1] Queen Mary Univ London, London, England
[2] Vis Semant Ltd, London, England
Funding: Innovate UK project
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Knowledge distillation is effective for training small, generalisable network models that meet low-memory and fast-execution requirements. Existing offline distillation methods rely on a strong pre-trained teacher, which enables favourable knowledge discovery and transfer but requires a complex two-phase training procedure. Online counterparts address this limitation at the price of lacking a high-capacity teacher. In this work, we present an On-the-fly Native Ensemble (ONE) learning strategy for one-stage online distillation. Specifically, ONE trains only a single multi-branch network while simultaneously establishing a strong teacher on the fly to enhance the learning of the target network. Extensive evaluations show that ONE improves the generalisation performance of a variety of deep neural networks more significantly than alternative methods on four image classification datasets: CIFAR10, CIFAR100, SVHN, and ImageNet, whilst retaining computational efficiency advantages.
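The abstract describes the core idea: branch outputs are gated into an on-the-fly ensemble teacher, which then distils back into each branch. The following is a minimal NumPy sketch of that objective for a single sample; the branch count, the softmax gate, the temperature value, and the exact loss weighting are illustrative assumptions based on the abstract's description, not details copied from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def one_loss(branch_logits, gate_logits, label, T=3.0):
    """Sketch of a ONE-style objective for one sample.

    branch_logits: (m, C) logits from m branches of the multi-branch network
    gate_logits:   (m,)   learned importance scores for combining branches
    label:         int    ground-truth class index
    T:             float  distillation temperature (a typical choice, assumed)
    """
    m, _ = branch_logits.shape
    gate = softmax(gate_logits)                # branch weights, sum to 1
    teacher_logits = gate @ branch_logits      # on-the-fly native ensemble teacher

    # Hard-label cross-entropy for every branch and for the teacher.
    ce = lambda logits: -np.log(softmax(logits)[label])
    loss_ce = sum(ce(branch_logits[i]) for i in range(m)) + ce(teacher_logits)

    # Soft-label distillation: KL(teacher || branch) at temperature T.
    p_t = softmax(teacher_logits, T)
    loss_kd = 0.0
    for i in range(m):
        p_b = softmax(branch_logits[i], T)
        loss_kd += np.sum(p_t * (np.log(p_t) - np.log(p_b)))

    # T^2 rescaling is the standard correction for temperature-softened gradients.
    return loss_ce + (T ** 2) * loss_kd
```

Because the teacher is just a gated combination of the branches being trained, both terms are computed in a single forward pass, which is the one-stage, teacher-free property the abstract highlights.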
Pages: 11