Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification

Cited by: 16
Authors
Shen, Peng [1 ]
Lu, Xugang [1 ]
Li, Sheng [1 ]
Kawai, Hisashi [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Koganei, Tokyo, Japan
Keywords
Task analysis; Speech processing; Training; Speech recognition; Neural networks; Feature extraction; Robustness; Internal representation learning; knowledge distillation; short utterances; spoken language identification
DOI
10.1109/TASLP.2020.3023627
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
With the successful application of deep feature learning algorithms, spoken language identification (LID) on long utterances achieves satisfactory performance. However, performance on short utterances degrades drastically, even when the LID system is trained on short utterances. The main reason is the large variation of the representations of short utterances, which results in high model confusion. To narrow the performance gap between long and short utterances, we propose a teacher-student representation learning framework based on knowledge distillation to improve LID performance on short utterances. In the proposed framework, in addition to training the student model on short utterances with their true labels, the internal representation from the output of a hidden layer of the student model is supervised by the representation of the corresponding longer utterances. By reducing the distance between the internal representations of short and long utterances, the student model can learn robust, discriminative representations for short utterances, which is expected to reduce model confusion. We conducted experiments on our in-house LID dataset and the NIST LRE07 dataset, and showed the effectiveness of the proposed methods for short-utterance LID tasks.
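The sketch below illustrates the training objective the abstract describes: a cross-entropy loss on the short utterance's true label plus a distance term pulling the student's internal representation toward the frozen teacher's representation of the corresponding long utterance. This is a minimal PyTorch illustration, not the authors' implementation: the toy network LidNet, the mean-pooling aggregation, the MSE distance, and the weight alpha are all assumptions; the paper's actual architecture and distance measure are specified in the full text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LidNet(nn.Module):
    """Toy LID backbone (hypothetical): returns a fixed-length utterance
    embedding (the 'internal representation') and per-language logits."""
    def __init__(self, feat_dim=40, emb_dim=256, n_langs=14):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, emb_dim), nn.ReLU(),
            nn.Linear(emb_dim, emb_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(emb_dim, n_langs)

    def forward(self, x):
        # x: (batch, frames, feat_dim); mean-pooling over frames is a
        # stand-in for the paper's utterance-level aggregation.
        emb = self.encoder(x).mean(dim=1)
        return emb, self.classifier(emb)

def distillation_loss(student, teacher, short_x, long_x, labels, alpha=0.5):
    """Cross-entropy on the short utterance plus a representation-matching
    term against the frozen teacher's long-utterance embedding.
    alpha is an illustrative weight, not a value from the paper."""
    with torch.no_grad():              # teacher provides targets only
        t_emb, _ = teacher(long_x)
    s_emb, s_logits = student(short_x)
    ce = F.cross_entropy(s_logits, labels)
    match = F.mse_loss(s_emb, t_emb)   # MSE as the embedding distance
    return ce + alpha * match

# Usage: short clips paired with their longer source utterances.
teacher, student = LidNet().eval(), LidNet()
loss = distillation_loss(student, teacher,
                         torch.randn(8, 100, 40),    # short utterances
                         torch.randn(8, 1000, 40),   # paired long utterances
                         torch.randint(0, 14, (8,)))
loss.backward()
```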
Pages: 2674-2683
Page count: 10
Related Papers (showing 10 of 50)
  • [1] Feature Representation of Short Utterances based on Knowledge Distillation for Spoken Language Identification
    Shen, Peng
    Lu, Xugang
    Li, Sheng
    Kawai, Hisashi
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1813 - 1817
  • [2] Knowledge Distillation-Based Domain-Invariant Representation Learning for Domain Generalization
    Niu, Ziwei
    Yuan, Junkun
    Ma, Xu
    Xu, Yingying
    Liu, Jing
    Chen, Yen-Wei
    Tong, Ruofeng
    Lin, Lanfen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 245 - 255
  • [3] INTERACTIVE LEARNING OF TEACHER-STUDENT MODEL FOR SHORT UTTERANCE SPOKEN LANGUAGE IDENTIFICATION
    Shen, Peng
    Lu, Xugang
    Li, Sheng
    Kawai, Hisashi
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5981 - 5985
  • [4] Conformer-based Language Embedding with Self-Knowledge Distillation for Spoken Language Identification
    Wang, Feng
    Huang, Lingyan
    Li, Tao
    Hong, Qingyang
    Li, Lin
    INTERSPEECH 2023, 2023, : 5286 - 5290
  • [5] Knowledge Distillation-based Learning Model Propagation for Urban Air Mobility
    Xiong, Kai
    Xie, Juefei
    Wang, Zhihong
    Leng, Supeng
    2024 IEEE 99TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2024-SPRING, 2024
  • [6] FedMEKT: Distillation-based embedding knowledge transfer for multimodal federated learning
    Le, Huy Q.
    Nguyen, Minh N. H.
    Thwal, Chu Myaet
    Qiao, Yu
    Zhang, Chaoning
    Hong, Choong Seon
    NEURAL NETWORKS, 2025, 183
  • [7] Continual Learning Based on Knowledge Distillation and Representation Learning
    Chen, Xiu-Yan
    Liu, Jian-Wei
    Li, Wen-Tao
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT IV, 2022, 13532 : 27 - 38
  • [8] Knowledge distillation-based deep learning classification network for peripheral blood leukocytes
    Leng, Bing
    Leng, Min
    Ge, Mingfeng
    Dong, Wenfei
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 75
  • [9] Knowledge Distillation-Based Zero-Shot Learning for Process Fault Diagnosis
    Liu, Yi
    Huang, Jiajun
    Jia, Mingwei
    ADVANCED INTELLIGENT SYSTEMS, 2024
  • [10] An Investigation of the Combination of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding
    Cappellazzo, Umberto
    Falavigna, Daniele
    Brutti, Alessio
    INTERSPEECH 2023, 2023, : 735 - 739