Towards Zero-Shot Knowledge Distillation for Natural Language Processing

Cited by: 0
Authors
Rashid, Ahmad [1 ]
Lioutas, Vasileios [2 ]
Ghaddar, Abbas [1 ]
Rezagholizadeh, Mehdi [1 ]
Affiliations
[1] Huawei Noah's Ark Lab, Montreal, PQ, Canada
[2] Univ British Columbia, Vancouver, BC, Canada
Keywords
DOI
Not available
Chinese Library Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge distillation (KD) is a common knowledge transfer algorithm used for model compression across a variety of deep-learning-based natural language processing (NLP) solutions. In its regular manifestations, KD requires access to the teacher's training data for knowledge transfer to the student network. However, privacy concerns, data regulations and proprietary reasons may prevent access to such data. We present, to the best of our knowledge, the first work on Zero-shot Knowledge Distillation for NLP, where the student learns from the much larger teacher without any task-specific data. Our solution combines out-of-domain data and adversarial training to learn the teacher's output distribution. We investigate six tasks from the GLUE benchmark and demonstrate that we can achieve between 75% and 92% of the teacher's classification score (accuracy or F1) while compressing the model 30 times.
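
For context on the objective the abstract describes, below is a minimal sketch of the standard soft-label distillation loss (Hinton et al., 2015) that a student minimizes against the teacher's output distribution; in the zero-shot setting described above, this loss would be computed on out-of-domain inputs rather than the teacher's task-specific training data. PyTorch, the temperature value, and the toy tensors are illustrative assumptions; the paper's adversarial training component is not shown.

# Minimal sketch of the soft-label distillation objective (Hinton et al.,
# 2015), assuming PyTorch. The temperature and the toy tensors below are
# illustrative assumptions, not the paper's actual configuration, and the
# adversarial component of the zero-shot method is omitted.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both output distributions with the same temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL(teacher || student), scaled by T^2 so gradient magnitudes stay
    # comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Toy usage: in the zero-shot setting, the inputs producing these logits
# would be out-of-domain sentences, never task-specific training data.
teacher_logits = torch.randn(8, 2)                       # frozen teacher outputs
student_logits = torch.randn(8, 2, requires_grad=True)   # student outputs
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()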
Pages: 6551-6561 (11 pages)
Related Papers (50 in total)
  • [1] Zero-Shot Knowledge Distillation in Deep Networks
    Nayak, Gaurav Kumar
    Mopuri, Konda Reddy
    Shaj, Vaisakh
    Babu, R. Venkatesh
    Chakraborty, Anirban
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [2] Towards Zero-shot Language Modeling
    Ponti, Edoardo M.
    Vulic, Ivan
    Cotterell, Ryan
    Reichart, Roi
    Korhonen, Anna
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 2900 - +
  • [3] Towards Zero-Shot Sign Language Recognition
    Bilge, Yunus Can
    Cinbis, Ramazan Gokberk
    Ikizler-Cinbis, Nazli
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) : 1217 - 1232
  • [4] Zero-shot Natural Language Video Localization
    Nam, Jinwoo
    Ahn, Daechul
    Kang, Dongyeop
    Ha, Seong Jong
    Choi, Jonghyun
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1450 - 1459
  • [5] Knowledge Distillation Classifier Generation Network for Zero-Shot Learning
    Yu, Yunlong
    Li, Bin
    Ji, Zhong
    Han, Jungong
    Zhang, Zhongfei
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (06) : 3183 - 3194
  • [6] Attribute Distillation for Zero-Shot Recognition
    Li, Houjun
    Wei, Boquan
    [J]. Computer Engineering and Applications, 2024, 60 (09) : 219 - 227
  • [7] A Lightweight Framework With Knowledge Distillation for Zero-Shot Mars Scene Classification
    Tan, Xiaomeng
    Xi, Bobo
    Xu, Haitao
    Li, Jiaojiao
    Li, Yunsong
    Xue, Changbin
    Chanussot, Jocelyn
    [J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62
  • [8] Zero-Shot Grounding of Objects from Natural Language Queries
    Sadhu, Arka
    Chen, Kan
    Nevatia, Ram
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 4693 - 4702
  • [9] Zero-Shot Reward Specification via Grounded Natural Language
    Mahmoudieh, Parsa
    Pathak, Deepak
    Darrell, Trevor
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [10] Zero-shot Learning of Classifiers from Natural Language Quantification
    Srivastava, Shashank
    Labutov, Igor
    Mitchell, Tom
    [J]. PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 306 - 316