Strategies to Improve a Speaker Diarisation Tool

被引:0
|
作者
Tavarez, David [1 ]
Navas, Eva [1 ]
Erro, Daniel [1 ]
Saratxaga, Ibon [1 ]
机构
[1] Univ Basque Country, Aholab, Dept Elect & Telecommun, Fac Engn, Bilbao 48013, Spain
关键词
Speaker Diarisation; Speaker Clustering; Evaluation;
D O I
暂无
中图分类号
H0 [语言学];
学科分类号
030303 ; 0501 ; 050102 ;
摘要
This paper describes the different strategies used to improve the results obtained by our off-line speaker diarisation tool with the Albayzin 2010 diarisation database. The errors made by the system have been analyzed and different strategies have been proposed to reduce each kind of error. Very short segments incorrectly labelled and different appearances of one speaker labelled with different identifiers are the most common errors. A post-processing module that refines the segmentation by retraining the GMM models of the speakers involved has been built to cope with these errors. This post-processing module has been tuned with the training dataset and improves the result of the diarisation system by 16.4% in the test dataset.
引用
收藏
页码:4117 / 4121
页数:5
相关论文
共 50 条
  • [1] Adapting Speaker Embeddings for Speaker Diarisation
    Kwon, Youngki
    Jung, Jee-weon
    Heo, Hee-Soo
    Kim, You Jin
    Lee, Bong-Jin
    Chung, Joon Son
    [J]. INTERSPEECH 2021, 2021, : 3101 - 3105
  • [2] CONTENT-AWARE SPEAKER EMBEDDINGS FOR SPEAKER DIARISATION
    Sun, G.
    Liu, D.
    Zhang, C.
    Woodland, P. C.
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7168 - 7172
  • [3] Two-way cluster voting to improve speaker diarisation performance
    Tranter, SE
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 753 - 756
  • [4] Combination of deep speaker embeddings for diarisation
    Sun, Guangzhi
    Zhang, Chao
    Woodland, Philip C.
    [J]. NEURAL NETWORKS, 2021, 141 : 372 - 384
  • [5] DNN APPROACH TO SPEAKER DIARISATION USING SPEAKER CHANNELS
    Milner, Rosanna
    Hain, Thomas
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4925 - 4929
  • [6] Speaker overlap detection with prosodic features for speaker diarisation
    Zelenak, M.
    Hernando, J.
    [J]. IET SIGNAL PROCESSING, 2012, 6 (08) : 798 - 804
  • [7] DNN-based speaker clustering for speaker diarisation
    Milner, Rosanna
    Hain, Thomas
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2185 - 2189
  • [8] DISCRIMINATIVE NEURAL CLUSTERING FOR SPEAKER DIARISATION
    Li, Qiujia
    Kreyssig, Florian L.
    Zhang, Chao
    Woodland, Philip C.
    [J]. 2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 574 - 581
  • [9] Spot the conversation: speaker diarisation in the wild
    Chung, Joon Son
    Huh, Jaesung
    Nagrani, Arsha
    Afouras, Triantafyllos
    Zisserman, Andrew
    [J]. INTERSPEECH 2020, 2020, : 299 - 303
  • [10] Audio-Visual Synchronisation for Speaker Diarisation
    Garau, Giulia
    Dielmann, Alfred
    Bourlard, Herve
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2662 - +