TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing

Cited by: 0
Authors
Yang, Ziqing [1]
Cui, Yiming [1, 2]
Chen, Zhipeng [1]
Che, Wanxiang [2]
Liu, Ting [2]
Wang, Shijin [1, 3]
Hu, Guoping [1]
Affiliations
[1] iFLYTEK Research, State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, People's Republic of China
[2] Harbin Institute of Technology, Research Center for Social Computing and Information Retrieval (SCIR), Harbin, People's Republic of China
[3] iFLYTEK AI Research (Hebei), Langfang, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we introduce TextBrewer, an open-source knowledge distillation toolkit designed for natural language processing. It works with different neural network models and supports various kinds of supervised learning tasks, such as text classification, reading comprehension, and sequence labeling. TextBrewer provides a simple and uniform workflow that enables quick setup of distillation experiments with highly flexible configurations. It offers a set of predefined distillation methods and can be extended with custom code. As a case study, we use TextBrewer to distill BERT on several typical NLP tasks. With simple configurations, we achieve results that are comparable with or even higher than those of public distilled BERT models with similar numbers of parameters.
Pages: 9-16
Page count: 8