Stream-based speaker segmentation using speaker factors and eigenvoices

Cited by: 46

Authors:

Castaldo, Fabio ^{[1]}

Colibro, Daniele ^{[2]}

Dalmasso, Emanuele ^{[1]}

Laface, Pietro ^{[1]}

Vair, Claudio ^{[2]}

Affiliations:

[1] Politecnico di Torino, I-10129 Turin, Italy

[2] Loquendo, Turin, Italy

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008

Keywords:

speaker modeling; speaker segmentation; speaker factors; eigenvoices; speaker clustering;

DOI:

10.1109/ICASSP.2008.4518564

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper presents a stream-based approach to unsupervised segmentation of multi-speaker conversational speech. The main idea is to exploit prior knowledge about the speaker space to find a low-dimensional vector of speaker factors that summarizes the salient speaker characteristics. The new approach yields segmentation error rates lower than the state-of-the-art rates reported in our previous work on the segmentation task of the NIST 2000 Speaker Recognition Evaluation (SRE). We also show how the performance of a speaker recognition system on the core test of the 2006 NIST SRE is affected, comparing results obtained with single-speaker test data and with automatically segmented test data.

Pages: 4133 / +

Page count: 2
