Robust Speaker Diarization for Short Speech Recordings

Cited by: 11

Authors:

Imseng, David [1,2]

Friedland, Gerald [3]

Affiliations:

[1] Idiap Res Inst, POB 592, CH-1920 Martigny, Switzerland

[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland

[3] Int Comp Sci Inst, Berkeley, CA 94704 USA

Source:

2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009) | 2009

DOI:

10.1109/ASRU.2009.5373254

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We investigate how a state-of-the-art Speaker Diarization system behaves on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in Speaker Diarization benchmarks. First, the problems inherent to this task are analyzed. Then, we propose a novel method for estimating the initialization parameters of typical state-of-the-art diarization approaches. The estimation method balances the relationship between the optimal value of the duration of speech data per Gaussian and the duration of the speech data, a relationship that is verified experimentally for the first time in this article. As a result, the Diarization Error Rate for short meetings extracted from the 2006, 2007, and 2009 NIST RT evaluation data is reduced by up to 50% relative.
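The two quantities named in the abstract can be made concrete with a small sketch. It is not the authors' estimation method, which the abstract does not spell out. It only assumes that "duration of speech data per Gaussian" means the amount of speech in seconds divided by the total number of Gaussians in the initial models, and that "relative" reduction means the drop in Diarization Error Rate divided by the baseline rate. All function names and the numbers in the usage lines are hypothetical.

```python
def seconds_per_gaussian(speech_seconds: float, total_gaussians: int) -> float:
    """Speech duration per Gaussian (assumed definition: seconds / Gaussians)."""
    if total_gaussians <= 0:
        raise ValueError("total_gaussians must be positive")
    return speech_seconds / total_gaussians


def gaussians_for_target(speech_seconds: float, target_seconds_per_gaussian: float) -> int:
    """Total number of Gaussians that yields roughly the target seconds per Gaussian."""
    if target_seconds_per_gaussian <= 0:
        raise ValueError("target_seconds_per_gaussian must be positive")
    # At least one Gaussian is always needed, even for very short recordings.
    return max(1, round(speech_seconds / target_seconds_per_gaussian))


def relative_der_reduction(baseline_der: float, new_der: float) -> float:
    """Relative error reduction, e.g. 0.5 means the error rate was halved."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the arithmetic.
    print(seconds_per_gaussian(100.0, 50))
    print(gaussians_for_target(500.0, 5.0))
    print(relative_der_reduction(0.30, 0.15))
```

With a fixed model size, a 100-second recording leaves far less data per Gaussian than a 500-second one, which is why the initialization has to adapt to the recording length.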

Pages: 432 / +

Page count: 2
