Stream-based speaker segmentation using speaker factors and eigenvoices

Cited by: 46
Authors
Castaldo, Fabio [1 ]
Colibro, Daniele [2 ]
Dalmasso, Emanuele [1 ]
Laface, Pietro [1 ]
Vair, Claudio [2 ]
Affiliations
[1] Politecnico di Torino, I-10129 Turin, Italy
[2] Loquendo, Turin, Italy
Keywords
speaker modeling; speaker segmentation; speaker factors; eigenvoices; speaker clustering;
DOI
10.1109/ICASSP.2008.4518564
CLC number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper presents a stream-based approach to unsupervised segmentation of multi-speaker conversational speech. The main idea is to exploit prior knowledge about the speaker space to find a low-dimensional vector of speaker factors that summarizes the salient speaker characteristics. The new approach yields segmentation error rates lower than the state-of-the-art results reported in our previous work on the segmentation task of the NIST 2000 Speaker Recognition Evaluation (SRE). We also show how the performance of a speaker recognition system in the core test of the 2006 NIST SRE is affected, by comparing the results obtained with single-speaker and automatically segmented test data.
Pages: 4133 / +
Page count: 2