Multistage speaker diarization of broadcast news

Cited by: 127
Authors
Barras, Claude [1]
Zhu, Xuan [1]
Meignier, Sylvain [1]
Gauvain, Jean-Luc [1]
Affiliations
[1] CNRS, Comp Sci Lab Mech & Engn Sci, F-91403 Orsay, France
Keywords
Bayesian information criterion (BIC) clustering; speaker diarization; speaker identification (SID); speaker segmentation and clustering
DOI
10.1109/TASL.2006.878261
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes recent advances in speaker diarization with a multistage segmentation and clustering system, which incorporates a speaker identification step. This system builds upon the baseline audio partitioner used in the LIMSI broadcast news transcription system. The baseline partitioner provides high cluster purity, but has a tendency to split data from speakers with a large quantity of data into several segment clusters. Several improvements to the baseline system have been made. First, the iterative Gaussian mixture model (GMM) clustering has been replaced by a Bayesian information criterion (BIC) agglomerative clustering. Second, an additional clustering stage has been added, using a GMM-based speaker identification method. Finally, a post-processing stage refines the segment boundaries using the output of a transcription system. On the National Institute of Standards and Technology (NIST) RT-04F and ESTER evaluation data, the multistage system reduces the speaker error by over 70% relative to the baseline system, and gives between 40% and 50% reduction relative to a single-stage BIC clustering system.
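The BIC agglomerative clustering named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each cluster is modeled by a single diagonal-covariance Gaussian (a simplification chosen to keep the example dependency-free), uses the commonly cited delta-BIC merge criterion, and the names `delta_bic`, `bic_cluster`, and the penalty weight `lam` are hypothetical.

```python
import math


def log_det_diag(frames):
    """Log-determinant of the maximum-likelihood diagonal covariance.

    `frames` is a list of equal-length feature vectors. Assumes every
    dimension has non-zero variance.
    """
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        total += math.log(var)
    return total


def delta_bic(x_i, x_j, lam=1.0):
    """Delta-BIC for merging two clusters; a negative value favors merging.

    Assumption: one diagonal Gaussian per cluster, so keeping the clusters
    separate costs 2 * dim extra parameters (a mean and a variance per
    dimension). `lam` is a hypothetical tunable penalty weight.
    """
    n_i, n_j = len(x_i), len(x_j)
    n = n_i + n_j
    dim = len(x_i[0])
    penalty = 0.5 * lam * (2 * dim) * math.log(n)
    return (0.5 * n * log_det_diag(x_i + x_j)
            - 0.5 * n_i * log_det_diag(x_i)
            - 0.5 * n_j * log_det_diag(x_j)
            - penalty)


def bic_cluster(segments, lam=1.0):
    """Greedy agglomerative clustering driven by delta-BIC.

    Repeatedly merges the pair with the lowest delta-BIC and stops when no
    pair has a negative score. Returns lists of original segment indices.
    """
    clusters = [list(s) for s in segments]
    members = [[k] for k in range(len(clusters))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                score = delta_bic(clusters[a], clusters[b], lam)
                if best is None or score < best[0]:
                    best = (score, a, b)
        score, a, b = best
        if score >= 0:
            break  # every remaining merge is rejected by the criterion
        clusters[a] = clusters[a] + clusters[b]
        members[a] = members[a] + members[b]
        del clusters[b]
        del members[b]
    return members
```

Calling `bic_cluster(segments)` on a list of per-segment feature matrices returns groups of segment indices, one group per hypothesized speaker. Raising `lam` makes merging easier and so yields fewer clusters; lowering it favors purity over coverage.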
Pages: 1505-1512
Page count: 8