TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing

Authors
Yang, Ziqing [1]
Cui, Yiming [1, 2]
Chen, Zhipeng [1]
Che, Wanxiang [2]
Liu, Ting [2]
Wang, Shijin [1, 3]
Hu, Guoping [1]
Affiliations
[1] iFLYTEK Res, State Key Lab Cognit Intelligence, Hefei, Anhui, Peoples R China
[2] Harbin Inst Technol, Res Ctr Social Comp & Informat Retrieval SCIR, Harbin, Peoples R China
[3] iFLYTEK AI Res Hebei, Langfang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we introduce TextBrewer, an open-source knowledge distillation toolkit designed for natural language processing. It works with different neural network models and supports various kinds of supervised learning tasks, such as text classification, reading comprehension, and sequence labeling. TextBrewer provides a simple and uniform workflow that enables quick setup of distillation experiments with highly flexible configurations. It offers a set of predefined distillation methods and can be extended with custom code. As a case study, we use TextBrewer to distill BERT on several typical NLP tasks. With simple configurations, we achieve results that are comparable to or even higher than those of public distilled BERT models with similar numbers of parameters.
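For readers unfamiliar with the underlying technique, the following is a minimal sketch of a generic soft-target distillation loss, written in plain Python. It is not TextBrewer's API and is not taken from the paper. The function names (`softmax`, `soft_target_loss`), the temperature value, and the example logits are all assumptions chosen for illustration. It shows the common formulation in which the student is trained to match the teacher's temperature-softened output distribution.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stabilized)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper for illustration; the T^2 factor is the usual
    convention for keeping gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Made-up logits for a three-class example (assumed values, not real model output).
teacher = [4.0, 1.0, -2.0]
student = [2.5, 0.5, -1.0]
loss = soft_target_loss(student, teacher)
print(f"soft-target loss: {loss:.4f}")
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two softened distributions diverge. In practice it is typically combined with the ordinary supervised loss on the ground-truth labels.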
Pages: 9-16
Number of pages: 8