Bayesian HMM based x-vector clustering for Speaker Diarization

Cited by: 29

Authors:

Diez, Mireia [1]

Burget, Lukas [1]

Wang, Shuai [1,2]

Rohdin, Johan [1]

Cernocky, Jan [1]

Affiliations:

[1] Brno Univ Technol, Fac Informat Technol, IT4I Ctr Excellence, Brno, Czech Republic

[2] Shanghai Jiao Tong Univ, Speechlab, Dept Comp Sci & Engn, Shanghai, Peoples R China

Source:

INTERSPEECH 2019 | 2019

Funding:

European Union Horizon 2020; US National Science Foundation;

Keywords:

Speaker Diarization; Variational Bayes; HMM; x-vector; DIHARD;

DOI:

10.21437/Interspeech.2019-2813

Chinese Library Classification:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification codes:

100104 ; 100213 ;

Abstract:

This paper presents a simplified version of the previously proposed diarization algorithm based on Bayesian Hidden Markov Models, which uses Variational Bayesian inference for very fast and robust clustering of x-vectors (neural network based speaker embeddings). The presented results show that this clustering algorithm provides significant improvements in diarization performance compared to the previously used Agglomerative Hierarchical Clustering. The output of this system can further be used to initialize a second-stage VB diarization system, which takes frame-wise MFCC features as input, to obtain optimal results.

Citation

Pages: 346-350

Number of pages: 5

