Robust Speaker Diarization for Short Speech Recordings

Cited by: 11
Authors
Imseng, David [1, 2]
Friedland, Gerald [3]
Affiliations
[1] Idiap Res Inst, POB 592, CH-1920 Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
[3] Int Comp Sci Inst, Berkeley, CA 94704 USA
Keywords
DOI
10.1109/ASRU.2009.5373254
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We investigate how a state-of-the-art Speaker Diarization system behaves on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in Speaker Diarization benchmarks. First, we analyze the problems inherent to this task. Then, we propose a novel method for estimating the initialization parameters of typical state-of-the-art diarization approaches. The method balances the optimal duration of speech data per Gaussian against the total duration of the speech data, a relationship that is verified experimentally for the first time in this article. As a result, the Diarization Error Rate for short meetings extracted from the 2006, 2007, and 2009 NIST RT evaluation data is reduced by up to 50% relative.
Pages: 432 / +
Page count: 2