Robust Speaker Diarization for Short Speech Recordings

Cited by: 11
Authors
Imseng, David [1, 2]
Friedland, Gerald [3]
Affiliations
[1] Idiap Res Inst, POB 592, CH-1920 Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
[3] Int Comp Sci Inst, Berkeley, CA 94704 USA
DOI
10.1109/ASRU.2009.5373254
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We investigate how a state-of-the-art speaker diarization system behaves on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in speaker diarization benchmarks. First, we analyze the problems inherent to this task. Then, we propose a novel method for estimating the initialization parameters of typical state-of-the-art diarization approaches. The estimation method accounts for the relationship between the optimal duration of speech data per Gaussian and the total duration of the speech data, a relationship that is verified experimentally for the first time in this article. As a result, the Diarization Error Rate for short meetings extracted from the 2006, 2007, and 2009 NIST RT evaluation data is reduced by up to 50% relative.
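The two quantities named in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' estimation method, which the abstract does not spell out. It only shows how a chosen "seconds of speech per Gaussian" value translates into a total Gaussian count for a recording of a given length, and how a relative reduction in Diarization Error Rate is computed. The function names and the numeric values are hypothetical and chosen purely for illustration.

```python
def total_gaussians(speech_seconds: float, seconds_per_gaussian: float) -> int:
    """Total number of Gaussians implied by a seconds-per-Gaussian setting.

    Hypothetical helper: fixing the amount of speech data assigned to each
    Gaussian makes the model size scale with the recording length.
    """
    if speech_seconds <= 0 or seconds_per_gaussian <= 0:
        raise ValueError("durations must be positive")
    # At least one Gaussian is always needed to model the data.
    return max(1, round(speech_seconds / seconds_per_gaussian))


def relative_reduction(baseline_der: float, new_der: float) -> float:
    """Relative error reduction, as a fraction of the baseline error rate."""
    if baseline_der <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline_der - new_der) / baseline_der


if __name__ == "__main__":
    # Illustrative values only; none of these numbers come from the paper.
    for duration in (100.0, 300.0, 500.0):
        print(duration, total_gaussians(duration, seconds_per_gaussian=5.0))
    print(relative_reduction(baseline_der=30.0, new_der=15.0))
```

With a fixed seconds-per-Gaussian value, a 100-second recording gets a fifth of the Gaussians of a 500-second one, which is why a setting tuned on long meetings may not transfer to short ones.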
Pages: 432 / +
Number of pages: 2