Bayesian HMM based x-vector clustering for Speaker Diarization

Cited by: 29
Authors
Diez, Mireia [1]
Burget, Lukas [1]
Wang, Shuai [1,2]
Rohdin, Johan [1]
Cernocky, Jan [1]
Affiliations
[1] Brno Univ Technol, Fac Informat Technol, IT4I Ctr Excellence, Brno, Czech Republic
[2] Shanghai Jiao Tong Univ, Speechlab, Dept Comp Sci & Engn, Shanghai, Peoples R China
Source
Funding
EU Horizon 2020; US National Science Foundation
Keywords
Speaker Diarization; Variational Bayes; HMM; x-vector; DIHARD;
DOI
10.21437/Interspeech.2019-2813
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
This paper presents a simplified version of a previously proposed diarization algorithm based on Bayesian Hidden Markov Models. It uses Variational Bayesian (VB) inference for very fast and robust clustering of x-vectors (neural-network-based speaker embeddings). The presented results show that this clustering algorithm significantly improves diarization performance compared with the previously used Agglomerative Hierarchical Clustering (AHC). The output of this system can further serve as the initialization for a second-stage VB diarization system, which takes frame-wise MFCC features as input, to obtain the best results.
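To make the baseline named in the abstract concrete, the following is a minimal sketch of agglomerative hierarchical clustering over speaker embeddings. It is not the paper's Bayesian HMM method and not the authors' code. The function name `ahc_cosine`, the cosine-similarity scoring, the average-linkage rule, the `threshold` value, and the toy 3-dimensional vectors are all assumptions chosen for illustration; real x-vectors are much higher-dimensional, and a diarization system may score pairs differently.

```python
import numpy as np


def ahc_cosine(embeddings, threshold=0.5):
    """Average-linkage agglomerative clustering on cosine similarity.

    Repeatedly merges the two most similar clusters and stops once the
    best pair scores below `threshold`. Returns one integer label per row.
    (Hypothetical helper for illustration only.)
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-length rows
    sim = x @ x.T                                     # pairwise cosine similarity
    clusters = [[i] for i in range(len(x))]           # start with singletons

    while len(clusters) > 1:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average similarity over all cross-cluster member pairs.
                score = sim[np.ix_(clusters[a], clusters[b])].mean()
                if score > best:
                    best, pair = score, (a, b)
        if best < threshold:
            break  # no remaining pair is similar enough to merge
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = np.empty(len(x), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


# Toy stand-ins for segment embeddings from two speakers (assumed data).
toy_embeddings = [
    [1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [1.0, 0.0, 0.0],  # speaker A
    [0.0, 1.0, 0.1], [0.1, 0.9, 0.0], [0.0, 1.0, 0.0],  # speaker B
]
labels = ahc_cosine(toy_embeddings, threshold=0.5)
print(labels)
```

In this sketch the number of speakers is not fixed in advance; it follows from where the threshold stops the merging, which is why the threshold needs tuning on held-out data.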
Pages: 346-350
Page count: 5
Related papers
50 in total
  • [1] Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks
    Landini, Federico
    Profant, Jan
    Diez, Mireia
    Burget, Lukas
    [J]. COMPUTER SPEECH AND LANGUAGE, 2022, 71
  • [2] OPTIMIZING BAYESIAN HMM BASED X-VECTOR CLUSTERING FOR THE SECOND DIHARD SPEECH DIARIZATION CHALLENGE
    Diez, Mireia
    Burget, Lukas
    Landini, Federico
    Wang, Shuai
    Cernocky, Honza
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6519 - 6523
  • [3] Analysis of Speaker Diarization Based on Bayesian HMM With Eigenvoice Priors
    Diez, Mireia
    Burget, Lukas
    Landini, Federico
    Cernocky, Jan
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 (28) : 355 - 368
  • [4] Design Choices for X-vector Based Speaker Anonymization
    Srivastava, Brij Mohan Lal
    Tomashenko, N.
    Wang, Xin
    Vincent, Emmanuel
    Yamagishi, Junichi
    Maouche, Mohamed
    Bellet, Aurelien
    Tommasi, Marc
    [J]. INTERSPEECH 2020, 2020, : 1713 - 1717
  • [5] Privacy and Utility of X-Vector Based Speaker Anonymization
    Srivastava, Brij Mohan Lal
    Maouche, Mohamed
    Sahidullah, Md
    Vincent, Emmanuel
    Bellet, Aurelien
    Tommasi, Marc
    Tomashenko, Natalia
    Wang, Xin
    Yamagishi, Junichi
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2383 - 2395
  • [6] A Study of X-vector Based Speaker Recognition on Short Utterances
    Kanagasundaram, A.
    Sridharan, S.
    Sriram, G.
    Prachi, S.
    Fookes, C.
    [J]. INTERSPEECH 2019, 2019, : 2943 - 2947
  • [7] Research on x-vector speaker recognition algorithm based on Kaldi
    Zhao, Hong
    Yue, Lupeng
    Wang, Weijie
    Zeng, Xiangyan
    [J]. INTERNATIONAL JOURNAL OF COMPUTING SCIENCE AND MATHEMATICS, 2022, 15 (03) : 199 - 212
  • [8] Multi-task learning for X-vector based speaker recognition
    Zhang Y.
    Liu L.
    [J]. International Journal of Speech Technology, 2023, 26 (04) : 817 - 823
  • [9] Speaker Recognition using Multiple X-Vector Speaker Representations with Two-Stage Clustering and Outlier Detection Refinement
    Shrestha, Roman
    Glackin, Cornelius
    Wall, Julie
    Cannings, Nigel
    Rajwadi, Marvin
    Kada, Satya
    Laird, James
    Laird, Thea
    Woodruff, Chris
    [J]. 2022 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2022, : 330 - 335
  • [10] KL-HMM BASED SPEAKER DIARIZATION SYSTEM FOR MEETINGS
    Madikeri, Srikanth
    Bourlard, Herve
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4435 - 4439