Speaker indexing and adaptation using speaker clustering based on statistical model selection

Cited by: 0

Authors:

Nishida, M [1]

Kawahara, T [1]

Affiliations:

[1] Chiba Univ, Grad Sch Sci & Technol, Inage Ku, Chiba 2638522, Japan

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING | 2004

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper addresses unsupervised speaker indexing and automatic speech recognition of discussions. Speaker indexing covers two cases: the number of speakers is either unknown or known beforehand. When the number is unknown, indexing is difficult to apply to varied data because several parameters, such as a threshold, must be determined. In addition, applying a uniform model causes serious problems because utterance durations vary widely among speakers. We therefore propose a method that performs speaker indexing robustly in both cases, using a flexible framework in which an optimal speaker model (GMM or VQ) is selected based on the Bayesian information criterion (BIC). We also propose combining the indexing method with speaker adaptation based on speaker selection. On real discussion archives, indexing performance was higher than that of conventional methods in both cases, and the combination method improved speech recognition performance.
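The BIC-based choice between speaker models mentioned in the abstract can be illustrated with a minimal sketch. It uses the textbook definition BIC = k ln(n) - 2 ln(L), where lower is better; the abstract does not give the paper's exact formulation or how a VQ codebook is scored, so the function names `bic` and `select_model`, the candidate labels, and all numbers below are hypothetical and for illustration only.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Textbook BIC: k * ln(n) - 2 * ln(L). Lower values are preferred."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_model(candidates, n_samples):
    """Return the name of the candidate with the lowest BIC.

    `candidates` maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_samples))


# Made-up scores: a small model and a larger one fitted to the same data.
short_utterance = {"VQ": (-1200.0, 104), "GMM": (-1150.0, 424)}
long_utterance = {"VQ": (-120000.0, 104), "GMM": (-115000.0, 424)}

print(select_model(short_utterance, n_samples=50))
print(select_model(long_utterance, n_samples=5000))
```

With few frames the parameter penalty dominates and the smaller model tends to win; with many frames the likelihood gain of the larger model can outweigh the penalty. This is one plausible reading of why a single uniform model is a poor fit when utterance durations vary widely.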


Pages: 353-356

Page count: 4

50 records in total

[1] Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion
Nishida, M
Kawahara, T
[J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003: 172-175
[2] Speaker adaptation for telephony data using speaker clustering
Wu, C
Lubensky, D
Wang, ZH
[J]. 2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000: 768-771
[3] Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing
Nishida, M
Kawahara, T
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13(04): 583-592
[4] UBM based speaker selection and model re-estimation for speaker adaptation
Wang, Jian
Guo, Jun
Liu, Gang
Lei, Jianjun
[J]. PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006: 856-860
[5] Kernel-based speaker clustering for rapid speaker adaptation
Hazrati, Dooz
Ahadi, S. M.
Sadjadi, Omid
[J]. PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: NEW GENERATIONS, 2008: 1287-1289
[6] SVM based speaker selection using GMM supervector for rapid speaker adaptation
Wang, Jian
Lei, Jianjun
Guo, Jun
Yang, Zhen
[J]. SIMULATED EVOLUTION AND LEARNING, PROCEEDINGS, 2006, 4247: 617-624
[7] Rapid Unsupervised Speaker Adaptation Using Single Utterance Based on MLLR and Speaker Selection
Gomez, Randy
Toda, Tomoki
Saruwatari, Hiroshi
Shikano, Kiyohiro
[J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007: 1365-1368
[8] Speaker Adaptation Using i-Vector Based Clustering
Kim, Minsoo
Jang, Gil-Jin
Kim, Ji-Hwan
Lee, Minho
[J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2020, 14(07): 2785-2799
[9] EFFICIENT SPEAKER IDENTIFICATION USING DISTRIBUTIONAL SPEAKER MODEL CLUSTERING
Apsingekar, Vijendra Raj
De Leon, Phillip L.
[J]. 2008 42ND ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1-4, 2008: 1260-1264
[10] Speaker Clustering Performance Improvement using Eigen-Voice Speaker Adaptation
Moattar, M. H.
Homayounpour, M. M.
[J]. 2009 14TH INTERNATIONAL COMPUTER CONFERENCE, 2009: 500-505
