Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion

Cited by: 0

Authors:

Nishida, M [1]

Kawahara, T [1]

Affiliations:

[1] JST, PRESTO, Sakyo Ku, Kyoto 6068501, Japan

Source:

2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I | 2003

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and their durations vary widely, which causes significant problems for conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion) methods. We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive with a total duration of 10 hours, it is demonstrated that the proposed method achieves higher indexing performance than conventional methods.
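
The selection idea in the abstract can be illustrated with a minimal sketch of generic BIC-based model selection. This is not the authors' implementation: the abstract does not say how the VQ likelihood or the parameter counts are defined, so the function names (`bic`, `select_model`), the `penalty_weight` parameter, and all numeric values below are hypothetical and chosen only to show how the BIC penalty can favor a smaller model on short segments and a larger one on long segments.

```python
import math

def bic(log_likelihood, num_params, num_frames, penalty_weight=1.0):
    """BIC in the 'higher is better' convention:
    log-likelihood minus a penalty of (penalty_weight / 2) * k * log(N)."""
    return log_likelihood - 0.5 * penalty_weight * num_params * math.log(num_frames)

def select_model(candidates, num_frames):
    """Return the name of the candidate with the highest BIC.

    candidates maps a model name to (log_likelihood, num_params).
    """
    return max(candidates, key=lambda name: bic(*candidates[name], num_frames))

# Hypothetical numbers, for illustration only (not taken from the paper).
# Short segment: the GMM fits slightly better but has many more parameters.
short_segment = {"VQ": (-5200.0, 200), "GMM": (-5000.0, 800)}
# Long segment: the same per-frame fit gap, accumulated over 100x more frames.
long_segment = {"VQ": (-520000.0, 200), "GMM": (-500000.0, 800)}

short_choice = select_model(short_segment, num_frames=100)
long_choice = select_model(long_segment, num_frames=10000)
```

With these assumed values, the penalty term dominates for the 100-frame segment and the likelihood gain dominates for the 10000-frame segment, which mirrors the behavior the abstract expects.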

Pages: 172-175

Number of pages: 4

