VARIATIONAL BAYESIAN SPEAKER DIARIZATION OF MEETING RECORDINGS

Cited by: 17
Authors
Valente, Fabio [1 ]
Motlicek, Petr [1 ]
Vijayasenan, Deepu [1 ]
Affiliations
[1] IDIAP Res Inst, CH-1920 Martigny, Switzerland
Keywords
Variational Bayesian Methods; Speaker Diarization; Meetings Data
DOI
10.1109/ICASSP.2010.5495087
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper investigates the use of the Variational Bayesian (VB) framework for speaker diarization of meeting data, extending previous related work on Broadcast News audio. VB learning maximizes a bound on the model marginal likelihood, known as the Free Energy, and allows joint model learning and model selection according to the same objective function. While the BIC is valid only in the asymptotic limit, the Free Energy is always a valid bound. The paper proposes using the Free Energy as the objective function in speaker diarization. It can be used to select dynamically, without any supervision or tuning, the elements that typically affect diarization performance, i.e. the inferred number of speakers, the size of the GMM, and the initialization. The proposed approach is compared with a conventional state-of-the-art system on the RT06 evaluation data for meeting recordings diarization and shows a relative improvement of 8.4% in speaker error.
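For orientation, the relationship between the Free Energy and the marginal likelihood mentioned in the abstract can be written in its standard textbook form. This is a generic sketch, not the paper's own notation: the symbols X (observed features), \theta (all model parameters and hidden variables), m (a candidate model, e.g. a given number of speakers and GMM size), q (the variational posterior), \hat{\theta} (a maximum-likelihood estimate), k_m (number of free parameters) and N (number of observations) are assumptions introduced here.

```latex
% Free Energy: a lower bound on the log marginal likelihood for any q
\log p(X \mid m)
  = \log \int p(X, \theta \mid m)\, d\theta
  \;\ge\; \int q(\theta)\, \log \frac{p(X, \theta \mid m)}{q(\theta)}\, d\theta
  \;=\; F_m(q)

% The gap is a KL divergence, so the bound holds for any amount of data
\log p(X \mid m) - F_m(q)
  = \mathrm{KL}\!\left( q(\theta) \,\|\, p(\theta \mid X, m) \right) \;\ge\; 0

% BIC, by contrast, approximates the same quantity only as N grows large
\mathrm{BIC}(m) = \log p(X \mid \hat{\theta}, m) - \frac{k_m}{2} \log N
```

Because F_m(q) depends on both q and m, maximizing it over q performs learning, and comparing the maximized values across candidate models m performs model selection with the same objective.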
Pages: 4954-4957
Number of pages: 4