Strategies to Improve a Speaker Diarisation Tool

Cited by: 0

Authors:

Tavarez, David [1]

Navas, Eva [1]

Erro, Daniel [1]

Saratxaga, Ibon [1]

Affiliation:

[1] Univ Basque Country, Aholab, Dept Elect & Telecommun, Fac Engn, Bilbao 48013, Spain

Source:

LREC 2012 - Eighth International Conference on Language Resources and Evaluation | 2012

Keywords:

Speaker Diarisation; Speaker Clustering; Evaluation

DOI:

Not available

Chinese Library Classification (CLC) number:

H0 [Linguistics];

Discipline classification code:

030303 ; 0501 ; 050102 ;

Abstract:

This paper describes the strategies used to improve the results obtained by our off-line speaker diarisation tool on the Albayzin 2010 diarisation database. The errors made by the system were analysed, and strategies were proposed to reduce each kind of error. The most common errors are very short segments that are incorrectly labelled and separate appearances of one speaker that receive different identifiers. To cope with these errors, a post-processing module was built that refines the segmentation by retraining the GMMs of the speakers involved. This module was tuned on the training dataset and improves the result of the diarisation system by 16.4% on the test dataset.
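As a rough illustration of the first error type named in the abstract (very short, incorrectly labelled segments), the sketch below relabels any segment shorter than a threshold with its neighbour's speaker and then merges adjacent segments that share a label. It is not the authors' method: the paper's module retrains speaker GMMs, which this sketch does not attempt. The function name `absorb_short_segments`, the `(start, end, speaker)` tuple layout, and the 1.0 s default threshold are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_seconds, end_seconds, speaker_label)
Segment = Tuple[float, float, str]


def absorb_short_segments(segments: List[Segment], min_dur: float = 1.0) -> List[Segment]:
    """Relabel short segments with a neighbour's label, then merge runs.

    Assumes `segments` is sorted by start time and non-overlapping.
    `min_dur` is an illustrative threshold, not a value from the paper.
    """
    relabelled: List[Segment] = []
    for i, (start, end, spk) in enumerate(segments):
        if end - start < min_dur:
            if relabelled:
                # Take the label of the preceding (possibly relabelled) segment.
                spk = relabelled[-1][2]
            elif i + 1 < len(segments):
                # First segment is short: fall back to the following label.
                spk = segments[i + 1][2]
        relabelled.append((start, end, spk))

    merged: List[Segment] = []
    for start, end, spk in relabelled:
        # Merge only contiguous segments that now share a speaker label.
        if merged and merged[-1][2] == spk and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end, spk)
        else:
            merged.append((start, end, spk))
    return merged


example = [(0.0, 5.0, "A"), (5.0, 5.3, "B"), (5.3, 9.0, "A"), (9.0, 14.0, "B")]
result = absorb_short_segments(example)
```

Here the 0.3 s segment labelled "B" is absorbed into the surrounding "A" turn. A purely duration-based rule like this can also erase genuine short turns, which is one reason a model-based refinement such as the one described in the abstract may be preferable.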


Pages: 4117-4121

Page count: 5

