Robust Speaker Diarization for Short Speech Recordings

Cited by: 11
Authors
Imseng, David [1, 2]
Friedland, Gerald [3]
Affiliations
[1] Idiap Res Inst, POB 592, CH-1920 Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
[3] Int Comp Sci Inst, Berkeley, CA 94704 USA
DOI
10.1109/ASRU.2009.5373254
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We investigate how a state-of-the-art speaker diarization system behaves on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in speaker diarization benchmarks. First, we analyze the problems inherent to this task. We then propose a novel method for estimating the initialization parameters of typical state-of-the-art diarization approaches. The method accounts for the relationship between the optimal duration of speech data per Gaussian and the total duration of the speech data, a relationship that is verified experimentally for the first time in this article. As a result, the Diarization Error Rate (DER) for short meetings extracted from the 2006, 2007, and 2009 NIST RT evaluation data decreases by up to 50% relative.
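The interaction between recording length and the "duration of speech data per Gaussian" can be illustrated with a small arithmetic sketch. The abstract does not give the paper's estimation formula, so nothing below reproduces it: the function names, the 5 seconds per Gaussian, and the 5 Gaussians per cluster are hypothetical placeholders chosen only to show how a fixed setting shrinks the initial model as recordings get shorter, and how a relative error reduction is computed.

```python
import math


def total_gaussians(speech_seconds: float, seconds_per_gaussian: float) -> int:
    """Number of Gaussians that the available speech can support (at least 1)."""
    if speech_seconds <= 0 or seconds_per_gaussian <= 0:
        raise ValueError("durations must be positive")
    return max(1, math.floor(speech_seconds / seconds_per_gaussian))


def initial_clusters(speech_seconds: float,
                     seconds_per_gaussian: float,
                     gaussians_per_cluster: int) -> int:
    """Number of initial clusters if every cluster starts with the same
    number of Gaussians (an assumption made for this illustration)."""
    if gaussians_per_cluster <= 0:
        raise ValueError("gaussians_per_cluster must be positive")
    total = total_gaussians(speech_seconds, seconds_per_gaussian)
    return max(1, total // gaussians_per_cluster)


def relative_reduction(baseline_der: float, new_der: float) -> float:
    """Relative reduction of an error rate, as a fraction of the baseline."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der


if __name__ == "__main__":
    # Hypothetical placeholder values, not taken from the paper.
    for seconds in (500.0, 100.0):
        print(seconds,
              total_gaussians(seconds, 5.0),
              initial_clusters(seconds, 5.0, 5))
    # A drop from 20% DER to 10% DER is a 50% relative reduction.
    print(relative_reduction(20.0, 10.0))
```

With these placeholder settings, a 100-second recording supports one fifth of the Gaussians that a 500-second recording does. That is the kind of dependence on recording length that makes a fixed initialization parameter a poor fit for short recordings.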
Pages: 432 / +
Page count: 2