Improving Speaker Segmentation via Speaker Identification and Text Segmentation

Cited by: 0

Authors:

Li, Runxin [1]

Schultz, Tanja [1]

Jin, Qin [1]

Affiliation:

[1] Carnegie Mellon Univ, Language Technol Inst, InterACT, Pittsburgh, PA 15213 USA

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Keywords:

speaker diarization; speaker segmentation; speaker identification; text segmentation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Speaker segmentation is an essential part of a speaker diarization system. Common segmentation systems usually miss speaker change points when speakers switch quickly. These errors seriously confuse the subsequent speaker clustering step and result in high overall speaker diarization error rates. In this paper, two methods are proposed to deal with this problem. The first approach uses speaker identification techniques to boost speaker segmentation; the second applies text segmentation methods to improve speaker segmentation performance. Experiments on Quaero speaker diarization evaluation data show that our methods achieve up to a 45% relative reduction in the speaker diarization error and a 64% relative increase in the speaker change detection recall rate over the baseline system. Moreover, both approaches can be treated as post-processing steps over the baseline segmentation, so they can be applied in any speaker diarization system.

Pages: 928-931

Number of pages: 4

50 items in total

[1] Speaker segmentation and clustering
Kotti, Margarita
Moschou, Vassiliki
Kotropoulos, Constantine
[J]. SIGNAL PROCESSING, 2008, 88 (05) : 1091 - 1124
[2] Bayes Factor Based Speaker Segmentation for Speaker Diarization
Wang, D.
Vogt, R.
Sridharan, S.
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1405 - 1408
[3] Factor Analysis for Speaker Segmentation and Improved Speaker Diarization
Desplanques, Brecht
Demuynck, Kris
Martens, Jean-Pierre
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3081 - 3085
[4] Bayes Factor Based Speaker Segmentation for Speaker Diarization
Speech and Audio Research Laboratory, Queensland University of Technology, Brisbane, Australia
[J]. Proc. Annu. Conf. Int. Speech. Commun. Assoc., INTERSPEECH, (1405-1408):
[5] Joint Speaker Segmentation, Localization and Identification for Streaming Audio
Schmalenstroeer, Joerg
Haeb-Umbach, Reinhold
[J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 453 - 456
[6] Confidence Measures for Speaker Segmentation and their Relation to Speaker Verification
Vaquero, Carlos
Ortega, Alfonso
Villalba, Jesus
Miguel, Antonio
Lleida, Eduardo
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2310 - 2313
[7] SPEAKER SEGMENTATION USING DEEP SPEAKER VECTORS FOR FAST SPEAKER CHANGE SCENARIOS
Wang, Renyu
Gu, Mingliang
Li, Lantian
Xu, Mingxing
Zheng, Thomas Fang
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5420 - 5424
[8] Using Phoneme Recognition and Text-dependent Speaker Verification to Improve Speaker Segmentation for Chinese Speech
Wang, Gang
Wu, Xiaojun
Zheng, Thomas Fang
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1457 - 1460
[9] Voting for two speaker segmentation
Narayanaswamy, Balakrishnan
Gangadharaiah, Rashmi
Stern, Richard
[J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2086 - +
[10] Location based speaker segmentation
Lathoud, G
McCowan, IA
[J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 176 - 179
