KNOWLEDGE DISTILLATION FROM BERT IN PRE-TRAINING AND FINE-TUNING FOR POLYPHONE DISAMBIGUATION

Cited by: 0
Authors
Sun, Hao [1]
Tan, Xu [2]
Gan, Jun-Wei [3]
Zhao, Sheng [3]
Han, Dongxu [3]
Liu, Hongzhi [1]
Qin, Tao [2]
Liu, Tie-Yan [2]
Affiliations
[1] Peking Univ, Beijing, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
[3] Microsoft STC Asia, Beijing, Peoples R China
Keywords
Polyphone Disambiguation; Knowledge Distillation; Pre-training; Fine-tuning; BERT
DOI
10.1109/asru46091.2019.9003918
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Polyphone disambiguation aims to select the correct pronunciation for a polyphonic word from several candidates, which is important for text-to-speech synthesis. Since the pronunciation of a polyphonic word is usually decided by its context, polyphone disambiguation can be regarded as a language understanding task. Inspired by the success of BERT for language understanding, we propose to leverage pre-trained BERT models for polyphone disambiguation. However, BERT models are usually too heavy to be served online, in terms of both memory cost and inference speed. In this work, we focus on an efficient model for polyphone disambiguation and propose a two-stage knowledge distillation method that transfers knowledge from a heavy BERT model to a lightweight BERT model in both the pre-training and fine-tuning stages, in order to reduce online serving cost. Experiments on Chinese and English polyphone disambiguation datasets demonstrate that our method reduces model parameters by a factor of 5 and improves inference speed by 7 times, while nearly matching the classification accuracy (95.4% on Chinese and 98.1% on English) of the original BERT model.
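As a rough illustration of the distillation objective implied by the abstract, the sketch below implements the standard soft-label knowledge distillation loss (Hinton et al., 2015) in PyTorch, which could be applied to masked-LM logits in the pre-training stage and to pronunciation-classification logits in the fine-tuning stage. The temperature, loss weighting, and all identifiers (distillation_loss, student_logits, etc.) are illustrative assumptions, not details taken from the paper.

    # A minimal sketch, assuming a standard soft-label distillation loss;
    # the paper's exact loss terms and hyperparameters may differ.
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=2.0, alpha=0.5):
        # Soften both output distributions with a temperature before comparing.
        soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
        # KL divergence from teacher to student on the softened distributions;
        # the T^2 factor keeps its gradient scale comparable to the hard loss.
        soft_loss = F.kl_div(log_soft_student, soft_teacher,
                             reduction="batchmean") * temperature ** 2
        # Ordinary cross-entropy against the ground-truth labels (pronunciation
        # classes in fine-tuning, masked tokens in pre-training).
        hard_loss = F.cross_entropy(student_logits, labels)
        return alpha * soft_loss + (1.0 - alpha) * hard_loss

In the two-stage scheme described above, such an objective would be computed twice: once while the lightweight model is pre-trained against the teacher's masked-LM predictions, and again while it is fine-tuned on the labeled polyphone data.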
Pages: 168-175 (8 pages)