Extending the Task of Diarization to Speaker Attribution

Cited by: 0

Authors:

Ghaemmaghami, Houman [1]

Dean, David [1]

Vogt, Robbie [1]

Sridharan, Sridha [1]

Affiliations:

[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011

Keywords:

speaker attribution; diarization; clustering; cross likelihood ratio; joint factor analysis

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper we extend the concept of speaker annotation within a single recording, or speaker diarization, to a collection-wide approach we call speaker attribution. Speaker attribution is thus the task of clustering the inter-session clusters obtained through diarization, each expected to be speaker-homogeneous, according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. An attribution system is proposed that uses mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, alongside a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix, and complete-linkage clustering is applied to the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.
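As a rough illustration of the clustering step the abstract describes, the sketch below builds a pairwise score and groups clusters with complete linkage. It is a minimal sketch under stated assumptions, not the authors' implementation: the NCLR expression is the commonly used form (length-normalized log-likelihood ratios against the UBM, summed in both directions) and is assumed rather than taken from this record; the function names, the toy attribution matrix, and the stopping threshold of 0.0 are all hypothetical.

```python
from itertools import combinations


def nclr(ll_i_given_j, ll_i_given_ubm, n_i, ll_j_given_i, ll_j_given_ubm, n_j):
    """Normalized cross-likelihood ratio (assumed common form).

    ll_a_given_b: total log-likelihood of cluster a's frames under model b.
    n_i, n_j: number of frames in clusters i and j.
    Higher values suggest the two clusters share a speaker.
    """
    return ((ll_i_given_j - ll_i_given_ubm) / n_i
            + (ll_j_given_i - ll_j_given_ubm) / n_j)


def complete_linkage(scores, threshold):
    """Agglomerative clustering on a symmetric similarity matrix.

    Complete linkage scores a candidate merge by its *least* similar pair
    of members. The best-scoring merge is applied repeatedly until no merge
    reaches the (hypothetical) threshold.
    """
    clusters = [[i] for i in range(len(scores))]
    while len(clusters) > 1:
        best_pair, best_score = None, None
        for a, b in combinations(range(len(clusters)), 2):
            link = min(scores[i][j] for i in clusters[a] for j in clusters[b])
            if best_score is None or link > best_score:
                best_pair, best_score = (a, b), link
        if best_score < threshold:
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


# Toy attribution matrix for four diarization clusters (made-up values):
# clusters 0 and 2 score highly together, as do clusters 1 and 3.
toy_scores = [
    [0.0, -0.8, 0.9, -0.6],
    [-0.8, 0.0, -0.7, 0.5],
    [0.9, -0.7, 0.0, -0.9],
    [-0.6, 0.5, -0.9, 0.0],
]

speakers = complete_linkage(toy_scores, threshold=0.0)
print(speakers)  # [[0, 2], [1, 3]]
```

Each resulting group stands for one cross-recording speaker identity. In practice the log-likelihoods passed to `nclr` would come from the adapted cluster models and the UBM, and the threshold would be tuned on held-out data.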


Pages: 1056-1059

Page count: 4
