Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion

Cited by: 0

Authors:

Nishida, M [1]

Kawahara, T [1]

Affiliation:

[1] JST, PRESTO, Sakyo Ku, Kyoto 6068501, Japan

Source:

2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I | 2003

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and vary widely in duration, which causes significant problems when applying conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion) methods. We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive with a total duration of 10 hours, it is demonstrated that the proposed method achieves higher indexing performance than conventional methods.
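The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact BIC formulation, so the generic form (log-likelihood minus half the parameter count times the log of the number of frames) is assumed here. The function names, the per-frame log-likelihoods, and the parameter counts are all hypothetical, and treating VQ as a model with a countable number of free parameters is also an assumption.

```python
import math


def bic_score(log_likelihood, n_params, n_frames):
    """Generic BIC in 'higher is better' form (assumed, not from the paper)."""
    return log_likelihood - 0.5 * n_params * math.log(n_frames)


def select_model(candidates, n_frames):
    """Return the name of the candidate with the highest BIC score.

    candidates maps a model name to a pair
    (average log-likelihood per frame, number of free parameters).
    """
    scores = {
        name: bic_score(avg_ll * n_frames, n_params, n_frames)
        for name, (avg_ll, n_params) in candidates.items()
    }
    return max(scores, key=scores.get)


# Hypothetical values chosen only for illustration: the GMM fits each frame
# slightly better but has four times as many parameters as the VQ codebook.
CANDIDATES = {
    "VQ": (-30.0, 200),
    "GMM": (-29.0, 800),
}

short_choice = select_model(CANDIDATES, n_frames=300)   # short utterance
long_choice = select_model(CANDIDATES, n_frames=6000)   # long utterance
print(short_choice, long_choice)
```

Under these assumed numbers the complexity penalty dominates for the short segment, so the simpler model is picked, while the likelihood gain accumulates over the long segment and outweighs the penalty. This mirrors the behaviour the abstract describes, but the crossover point depends entirely on the invented values.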


Pages: 172-175

Number of pages: 4
