Speaker Diarization and Linking of Meeting Data

Cited by: 10
Authors
Ferras, Marc [1]
Madikeri, Srikanth [1]
Bourlard, Herve [1]
Affiliations
[1] Idiap Res Inst, CH-1920 Martigny, Switzerland
Keywords
Gaussian mixture model (GMM); i-vector; information bottleneck (IB); joint factor analysis (JFA); speaker diarization; speaker linking; Ward clustering
DOI
10.1109/TASLP.2016.2590139
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Finding who spoke when in a collection of recordings, with speakers being uniquely identified across the database, is a challenging task. In this scenario, reasonable computing times and acoustic variation across recordings remain two major concerns to address in state-of-the-art speaker diarization systems. This paper extends prior work on diarizing large speech datasets using algorithms that scale well with increasing amounts of data while compensating for across-recording variability. We follow a two-stage approach performing speaker diarization and speaker linking, the former focusing on local within-recording speaker changes and the latter focusing on global speaker changes across the database. In this study, we explore how these two modules interact with each other, while proposing a diarization fusion approach that prevents diarization errors from propagating to the linking stage. We further explore the diarization fusion for speaker linking using different linking strategies and speaker modeling variants. Evaluation on single distant microphone data from the Augmented Multiparty Interaction (AMI) corpus shows the effectiveness of the fusion approach after speaker linking and intersession variability modeling via joint factor analysis.
Pages: 1935-1945
Number of pages: 11