Self-Distillation into Self-Attention Heads for Improving Transformer-based End-to-End Neural Speaker Diarization

Cited by: 1
Authors
Jeoung, Ye-Rin [1 ]
Choi, Jeong-Hwan [1 ]
Seong, Ju-Seok [1 ]
Kyung, JeHyun [1 ]
Chang, Joon-Hyuk [1 ]
Affiliations
[1] Hanyang Univ, Dept Elect Engn, Seoul, South Korea
Source
INTERSPEECH 2023
Keywords
speaker diarization; end-to-end neural diarization; self-attention mechanism; fine-tuning; self-distillation;
DOI
10.21437/Interspeech.2023-1404
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
In this study, we explore self-distillation (SD) techniques to improve the performance of the transformer-encoder-based self-attentive (SA) end-to-end neural speaker diarization (EEND) model. We first apply SD approaches introduced in the automatic speech recognition field to the SA-EEND model to confirm their potential for speaker diarization. Then, we propose two novel SD methods for SA-EEND, which distill the prediction output of the model or the SA heads of the upper blocks into the SA heads of the lower blocks. Consequently, we expect the high-level speaker-discriminative knowledge learned by the upper blocks to be shared across the lower blocks, enabling the SA heads of the lower blocks to effectively capture the discriminative patterns of overlapped speech from multiple speakers. Experimental results on the simulated and CALLHOME datasets show that SD generally improves the baseline performance and that the proposed methods outperform the conventional SD approaches.
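The abstract describes two SD variants: distilling either the model's prediction output or the upper blocks' SA heads into the lower blocks' SA heads. Below is a minimal sketch, in PyTorch, of how such an attention-head distillation term could be combined with the diarization training loss; the tensor shapes, the KL-divergence formulation, and the weighting factor alpha are illustrative assumptions rather than the authors' exact formulation.

import torch
import torch.nn.functional as F

def attention_head_sd_loss(lower_attn: torch.Tensor,
                           upper_attn: torch.Tensor) -> torch.Tensor:
    # Illustrative assumption: both tensors hold row-normalized attention weights
    # of shape (batch, heads, frames, frames); the upper block acts as the teacher.
    eps = 1e-8
    log_student = torch.log(lower_attn + eps)   # lower-block heads (student)
    teacher = upper_attn.detach()               # upper-block heads (teacher, no gradient)
    return F.kl_div(log_student, teacher, reduction="batchmean")

def total_loss(diarization_loss: torch.Tensor,
               lower_attn: torch.Tensor,
               upper_attn: torch.Tensor,
               alpha: float = 0.1) -> torch.Tensor:
    # Combine the usual permutation-invariant diarization loss with the SD term;
    # alpha is a hypothetical interpolation weight, not a value from the paper.
    return diarization_loss + alpha * attention_head_sd_loss(lower_attn, upper_attn)

For the prediction-output variant mentioned in the abstract, the teacher would instead be the model's detached frame-level speaker posteriors, compared against an intermediate prediction derived from a lower block with an analogous divergence term.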
Pages: 3197-3201
Page count: 5
Related papers
50 in total
  • [41] Improving Hybrid CTC/Attention Architecture with Time-Restricted Self-Attention CTC for End-to-End Speech Recognition
    Wu, Long
    Li, Ta
    Wang, Li
    Yan, Yonghong
    APPLIED SCIENCES-BASEL, 2019, 9 (21):
  • [42] Reinforcement-Tracking: An End-to-End Trajectory Tracking Method Based on Self-Attention Mechanism
    Zhao, Guanglei
    Chen, Zihao
    Liao, Weiming
    INTERNATIONAL JOURNAL OF AUTOMOTIVE TECHNOLOGY, 2024, 25 (03) : 541 - 551
  • [44] IMPROVED END-TO-END SPOKEN UTTERANCE CLASSIFICATION WITH A SELF-ATTENTION ACOUSTIC CLASSIFIER
    Price, Ryan
    Mehrabani, Mahnoosh
    Bangalore, Srinivas
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8504 - 8508
  • [45] Application of an end-to-end model with self-attention mechanism in cardiac disease prediction
    Li, Li
    Chen, Xi
    Hu, Sanjun
    FRONTIERS IN PHYSIOLOGY, 2024, 14
  • [46] SELF-ATTENTION ALIGNER: A LATENCY-CONTROL END-TO-END MODEL FOR ASR USING SELF-ATTENTION NETWORK AND CHUNK-HOPPING
    Dong, Linhao
    Wang, Feng
    Xu, Bo
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5656 - 5660
  • [47] Improving End-to-End Neural Diarization Using Conversational Summary Representations
    Broughton, Samuel J.
    Samarakoon, Lahiru
    INTERSPEECH 2023, 2023, : 3157 - 3161
  • [48] Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR
    Maekaku, Takashi
    Fujita, Yuya
    Peng, Yifan
    Watanabe, Shinji
    INTERSPEECH 2022, 2022, : 1071 - 1075
  • [50] Transformer-based end-to-end scene text recognition
    Zhu, Xinghao
    Zhang, Zhi
    PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021), 2021, : 1691 - 1695