Factor Analysis for Speaker Segmentation and Improved Speaker Diarization

Cited by: 0
Authors
Desplanques, Brecht [1]
Demuynck, Kris [1]
Martens, Jean-Pierre [1]
Affiliations
[1] Ghent Univ iMinds, ELIS Multimedia Lab, Ghent, Belgium
Keywords
speaker change detection; speaker diarization; clustering; segmentation; factor analysis
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Speaker diarization includes two steps: speaker segmentation and speaker clustering. Speaker segmentation searches for speaker boundaries, whereas speaker clustering aims at grouping speech segments of the same speaker. In this work, the segmentation is improved by replacing the Bayesian Information Criterion (BIC) with a new iVector-based approach. Unlike BIC-based methods, which trigger on any acoustic dissimilarity, the proposed method suppresses phonetic variations and accentuates speaker differences. More specifically, our method generates boundaries based on the distance between two speaker factor vectors that are extracted on a frame-by-frame basis. The extraction relies on an eigenvoice matrix, so that large differences between speaker factor vectors indicate a different speaker. A Mahalanobis-based distance measure, in which the covariance matrix compensates for the remaining and detrimental phonetic variability, is shown to generate accurate boundaries. The detected segments are clustered by a state-of-the-art iVector Probabilistic Linear Discriminant Analysis system. Experiments on the COST278 multilingual broadcast news database show relative reductions of 50% in boundary detection errors. The speaker error rate is reduced by 8% relative.
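The boundary-generation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the speaker factor vectors have already been extracted (the eigenvoice-based extraction is not shown) and that a covariance matrix modelling the residual phonetic variability is given. The function names, the `lag` parameter, the threshold, and the simple peak-picking rule are all hypothetical choices made for illustration.

```python
import numpy as np


def mahalanobis_boundary_scores(factors, cov, lag=1):
    """Mahalanobis distance between speaker factor vectors `lag` frames apart.

    factors: array of shape (T, D), one speaker factor vector per frame
             (assumed to be extracted elsewhere).
    cov:     array of shape (D, D), assumed covariance of the residual
             (e.g. phonetic) variability.
    Returns an array of shape (T - lag,), where entry t compares frame t
    with frame t + lag.
    """
    factors = np.asarray(factors, dtype=float)
    precision = np.linalg.inv(np.asarray(cov, dtype=float))
    diffs = factors[lag:] - factors[:-lag]
    # Squared distance per frame pair: diff^T * precision * diff
    squared = np.einsum("ti,ij,tj->t", diffs, precision, diffs)
    return np.sqrt(np.maximum(squared, 0.0))


def pick_boundaries(scores, threshold):
    """Hypothetical peak picking: local maxima that exceed `threshold`."""
    boundaries = []
    for t in range(1, len(scores) - 1):
        is_peak = scores[t] >= scores[t - 1] and scores[t] > scores[t + 1]
        if is_peak and scores[t] > threshold:
            boundaries.append(t)
    return boundaries


# Synthetic demo data (not from the paper): two "speakers" with different
# mean factor vectors and small within-speaker noise.
rng = np.random.default_rng(0)
speaker_a = rng.normal(loc=0.0, scale=0.1, size=(20, 2))
speaker_b = rng.normal(loc=5.0, scale=0.1, size=(20, 2))
factors = np.vstack([speaker_a, speaker_b])

cov = np.eye(2)  # placeholder covariance, assumed for the demo
scores = mahalanobis_boundary_scores(factors, cov, lag=1)
boundaries = pick_boundaries(scores, threshold=3.0)
print(boundaries)
```

On this synthetic input the sketch reports a single boundary at the frame pair where the two simulated speakers meet. With real data, the covariance matrix would carry the phonetic-variability compensation described in the abstract, and the threshold would need tuning.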
Pages: 3081-3085
Number of pages: 5