An Iterative Speaker Re-Diarization Scheme for Improving Speaker-Based Entity Extraction in Multimedia Archives

Cited by: 0

Authors:

Ghaemmaghami, Houman [1]

Dean, David [1]

Sridharan, Sridha [1]

Affiliations:

[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld, Australia

Source:

15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014

Funding:

Australian Research Council;

Keywords:

speaker re-diarization; diarization; speaker linking; complete-linkage clustering; cross-likelihood ratio;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we present a novel scheme for improving speaker diarization by exploiting speakers who recur across multiple recordings within a large corpus. We call this technique speaker re-diarization and demonstrate that the initial speaker-linked diarization outputs can be reused to boost diarization accuracy within individual recordings. We first propose and evaluate two novel re-diarization techniques. We then demonstrate their complementary characteristics and fuse the two techniques to conduct speaker re-diarization across the SAIVT-BNEWS corpus of Australian broadcast data. This corpus contains recurring speakers in various independent recordings that need to be linked across the dataset. We show that our speaker re-diarization approach can provide a relative improvement of 23% in diarization error rate (DER) over the original diarization results, and can also improve the estimated number of speakers and the cluster purity and coverage metrics.
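
The keywords name complete-linkage clustering as part of the speaker-linking stage. As a rough illustration only, and not the authors' implementation, the sketch below links per-recording speaker clusters with complete linkage over a pairwise distance matrix. The function name `complete_linkage`, the stopping `threshold`, and the toy distances are all hypothetical; in a real system the distances would presumably be derived from a pairwise score such as the cross-likelihood ratio.

```python
from itertools import combinations


def complete_linkage(dist, threshold):
    """Agglomerative clustering with complete linkage (illustrative sketch).

    dist: symmetric matrix (list of lists) of pairwise distances.
    threshold: merging stops once the closest pair of clusters is farther apart.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: cluster distance is the LARGEST pairwise distance.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Pick the pair of clusters that is closest under complete linkage.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[a], clusters[b]) > threshold:
            break
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical distances between four speaker clusters taken from
# different recordings (made-up values, not results from the paper).
toy_dist = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
]
linked = complete_linkage(toy_dist, threshold=0.5)
print(linked)
```

With these made-up distances, items 0 and 1 end up linked as one speaker and items 2 and 3 as another. Because complete linkage uses the worst-case pairwise distance, two clusters merge only when every member of one is close to every member of the other, which tends to keep linked speaker clusters compact.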

Pages: 577-581

Page count: 5

10 items in total

[1] A SPEAKER REDIARIZATION SCHEME FOR IMPROVING DIARIZATION IN LARGE TWO-SPEAKER TELEPHONE DATASETS
Ghaemmaghami, Houman
Dean, David
Sridharan, Sridha
[J]. 2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1272 - 1276
[2] IMPROVING SEPARATION-BASED SPEAKER DIARIZATION VIA ITERATIVE MODEL REFINEMENT AND SPEAKER EMBEDDING BASED POST-PROCESSING
Niu, Shu-Tong
Du, Jun
Sun, Lei
Lee, Chin-Hui
[J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8387 - 8391
[3] Diarization-based Speaker Retrieval for Broadcast Television Archives
Huijbregts, Marijn
van Leeuwen, David
[J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1044 - 1047
[4] Initialization of Iterative-Based Speaker Diarization Systems for Telephone Conversations
Ben-Harush, Oshry
Ben-Harush, Ortal
Lapidot, Itshak
Guterman, Hugo
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02): 414 - 425
[5] ATTENTION-BASED NEURAL NETWORK FOR JOINT DIARIZATION AND SPEAKER EXTRACTION
Chazan, Shlomo E.
Gannot, Sharon
Goldberger, Jacob
[J]. 2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 301 - 305
[6] Randomization Effect on Iterative-Based Speaker Diarization System for Telephone Conversations
Furmanov, Tal
Aminov, Lidiya
Moyal, Ami
Lapidot, Itshak
[J]. 2014 IEEE 28TH CONVENTION OF ELECTRICAL & ELECTRONICS ENGINEERS IN ISRAEL (IEEEI), 2014,
[7] Robust Speaker Extraction Network based on Iterative Refined Adaptation
Deng, Chengyun
Ma, Shiqian
Sha, Yongtao
Zhang, Yi
Zhang, Hui
Song, Hui
Wang, Fei
[J]. INTERSPEECH 2021, 2021, : 3530 - 3534
[8] Audio-visual Speaker Diarization: Improved Voice Activity Detection with CNN based Feature Extraction
Fanaras, Konstantinos
Tragoudaras, Antonios
Antoniadis, Charalampos
Massoud, Yehia
[J]. 2022 IEEE 65TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS 2022), 2022,
[9] End-to-End Neural Speaker Diarization with an Iterative Refinement of Non-Autoregressive Attention-based Attractors
Rybicka, Magdalena
Villalba, Jesus
Dehak, Najim
Kowalczyk, Konrad
[J]. INTERSPEECH 2022, 2022, : 5090 - 5094
[10] Speaker Extraction using LCMV Beamformer with DNN-based SPP and RTF Identification Scheme
Malek, Ariel
Chazan, Shlomo E.
Malka, Ilan
Tourbabin, Vladimir
Goldberger, Jacob
Tzirkel-Hancock, Eli
Gannot, Sharon
[J]. 2017 25TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2017, : 2274 - 2278
