Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification

Cited: 16
Authors
Shen, Peng [1 ]
Lu, Xugang [1 ]
Li, Sheng [1 ]
Kawai, Hisashi [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Koganei, Tokyo, Japan
Keywords
Task analysis; Speech processing; Training; Speech recognition; Neural networks; Feature extraction; Robustness; Internal representation learning; knowledge distillation; short utterances; spoken language identification;
DOI
10.1109/TASLP.2020.3023627
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
With the successful application of deep feature learning algorithms, spoken language identification (LID) on long utterances achieves satisfactory performance. However, performance on short utterances degrades drastically, even when the LID system is trained on short utterances. The main reason is the large variation of the representations of short utterances, which results in high model confusion. To narrow the performance gap between long and short utterances, we propose a teacher-student representation learning framework based on knowledge distillation to improve LID performance on short utterances. In the proposed framework, in addition to training the student model on short utterances with their true labels, the internal representation from the output of a hidden layer of the student model is supervised with the representation of the corresponding longer utterances. By reducing the distance between the internal representations of short and long utterances, the student model learns robust, discriminative representations for short utterances, which is expected to reduce model confusion. Experiments on our in-house LID dataset and the NIST LRE07 dataset show the effectiveness of the proposed methods for short-utterance LID tasks.
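The abstract describes the training objective only in words; the sketch below illustrates how such a representation-level distillation loss could look, assuming a PyTorch-style setup. The model class LIDNet, the helper distillation_step, and the weighting factor alpha are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of representation-level knowledge
# distillation for short-utterance LID: cross-entropy on the short utterance's
# true label plus a penalty pulling the student's internal representation of
# the short utterance toward the teacher's representation of the long one.
import torch
import torch.nn as nn

class LIDNet(nn.Module):
    """Toy LID encoder + classifier; encode() exposes the internal
    (hidden-layer) representation used for the distillation loss."""
    def __init__(self, feat_dim=40, hidden_dim=256, num_langs=14):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_langs)

    def encode(self, x):
        _, h = self.encoder(x)      # h: (1, batch, hidden_dim)
        return h.squeeze(0)         # internal representation, (batch, hidden_dim)

    def forward(self, x):
        return self.classifier(self.encode(x))

def distillation_step(teacher, student, long_utt, short_utt, labels, alpha=0.5):
    """One training step of the student on a (long, short) utterance pair.
    `alpha` (an assumed hyper-parameter) balances the label loss against the
    representation-distance loss."""
    with torch.no_grad():
        target_repr = teacher.encode(long_utt)   # teacher sees the long utterance
    student_repr = student.encode(short_utt)     # student sees the short utterance
    logits = student.classifier(student_repr)
    loss = nn.functional.cross_entropy(logits, labels) \
         + alpha * nn.functional.mse_loss(student_repr, target_repr)
    return loss
```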
Pages: 2674-2683
Page count: 10
Related Papers
50 records in total
  • [31] FedRCIL: Federated Knowledge Distillation for Representation based Contrastive Incremental Learning
    Psaltis, Athanasios
    Chatzikonstantinou, Christos
    Patrikakis, Charalampos Z.
    Daras, Petros
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 3455 - 3464
  • [32] DiffSLU: Knowledge Distillation Based Diffusion Model for Cross-Lingual Spoken Language Understanding
    Mao, Tianjun
    Zhang, Chenghong
    INTERSPEECH 2023, 2023, : 715 - 719
  • [33] Efficient Vehicle Selection and Resource Allocation for Knowledge Distillation-Based Federated Learning in UAV-Assisted VEC
    Li, Chunlin
    Zhang, Yong
    Yu, Long
    Yang, Mengjie
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2025,
  • [34] Spoken Language Identification Using Deep Learning
    Singh, Gundeep
    Sharma, Sahil
    Kumar, Vijay
    Kaur, Manjit
    Baz, Mohammed
    Masud, Mehedi
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
  • [35] A Knowledge Distillation-Based Transportation System for Sensory Data Sharing Using LoRa
    Kumari, Preti
    Mishra, Rahul
    Gupta, Hari Prabhat
    IEEE SENSORS JOURNAL, 2021, 21 (22) : 25315 - 25322
  • [36] Knowledge distillation-based performance transferring for LSTM-RNN model acceleration
    Ma, Hongbin
    Yang, Shuyuan
    Wu, Ruowu
    Hao, Xiaojun
    Long, Huimin
    He, Guangjun
    SIGNAL IMAGE AND VIDEO PROCESSING, 2022, 16 (06) : 1541 - 1548
  • [37] KD-INR: Time-Varying Volumetric Data Compression via Knowledge Distillation-Based Implicit Neural Representation
    Han, Jun
    Zheng, Hao
    Bi, Chongke
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (10) : 6826 - 6838
  • [38] Knowledge Distillation-Based Robust UAV Swarm Communication Under Malicious Attacks
    Wu, Qirui
    Zhang, Yirun
    Yang, Zhaohui
    Shikh-Bahaei, Mohammad
    2024 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS 2024, 2024, : 1023 - 1029
  • [39] Facial landmark points detection using knowledge distillation-based neural networks
    Fard, Ali Pourramezan
    Mahoor, Mohammad H.
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2022, 215