Ensemble based speaker recognition using unsupervised data selection

Cited by: 0

Authors:

Huang, Chien-Lin [1]

Wang, Jia-Ching [1]

Ma, Bin [2]

Affiliations:

[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Taipei 32001, Taiwan

[2] Human Language Technol, Inst Infocomm Res I2R, Singapore 138632, Singapore

Source:

APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING | 2016 / Vol. 5

Keywords:

Speaker recognition; Ensemble classifier; Unsupervised data selection;

DOI:

10.1017/ATSIP.2016.10

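Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code: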

0808 ; 0809 ;

Abstract:

This paper presents an ensemble-based approach to speaker recognition that uses unsupervised data selection. Ensemble learning is a type of machine learning that combines several weak learners to achieve better performance than a single learner. A speech utterance is divided into several subsets based on its acoustic characteristics using unsupervised data selection methods. The ensemble classifiers are then trained on these non-overlapping subsets of speech data to improve recognition accuracy. This approach has two advantages. First, without any auxiliary information, ensemble classifiers based on unsupervised data selection make use of the different acoustic characteristics of the speech data. Second, the ensemble classifiers apply a divide-and-conquer strategy to avoid getting stuck in a local optimum when training a single classifier. Experiments on the 2010 and 2008 NIST Speaker Recognition Evaluation datasets show that using ensemble classifiers yields a significant performance gain.
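The pipeline described in the abstract (split the frames into non-overlapping acoustic subsets, train one classifier per subset, combine the results) can be illustrated with a toy sketch. The abstract does not say which selection method, classifier, or fusion rule the authors use, so everything here is an assumption made for illustration: k-means stands in for the unsupervised data selection, a per-subset mean vector stands in for each weak learner, score averaging stands in for the fusion, and all function names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def assign(frame, centroids):
    """Index of the centroid closest to the frame."""
    return min(range(len(centroids)), key=lambda i: sq_dist(frame, centroids[i]))

def split(frames, centroids):
    """Partition frames into non-overlapping subsets, one per centroid."""
    subsets = [[] for _ in centroids]
    for f in frames:
        subsets[assign(f, centroids)].append(f)
    return subsets

def kmeans(frames, k, iters=20, seed=0):
    """Toy k-means (assumed stand-in for unsupervised data selection)."""
    centroids = random.Random(seed).sample(frames, k)
    for _ in range(iters):
        groups = split(frames, centroids)
        # Keep the old centroid if a group happens to be empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def train_ensemble(train_data, centroids):
    """One weak model (a mean vector) per speaker and per acoustic subset."""
    return {spk: [mean(s) if s else None for s in split(frames, centroids)]
            for spk, frames in train_data.items()}

def score(test_frames, models, centroids):
    """Average the per-subset scores (assumed fusion rule)."""
    scores = []
    for subset, model in zip(split(test_frames, centroids), models):
        if subset and model is not None:
            scores.append(-sum(sq_dist(f, model) for f in subset) / len(subset))
    return sum(scores) / len(scores) if scores else float("-inf")

def identify(test_frames, ensemble, centroids):
    """Speaker whose ensemble gives the highest fused score."""
    return max(ensemble, key=lambda spk: score(test_frames, ensemble[spk], centroids))

def make_frames(offset, n, rng):
    """Synthetic 2-D frames: two acoustic classes plus a speaker offset."""
    frames = []
    for cx, cy in [(0.0, 0.0), (10.0, 10.0)]:
        for _ in range(n):
            frames.append([cx + offset[0] + rng.gauss(0, 0.3),
                           cy + offset[1] + rng.gauss(0, 0.3)])
    return frames

rng = random.Random(1)
train = {"alice": make_frames((0.5, -0.5), 50, rng),
         "bob": make_frames((-0.5, 0.5), 50, rng)}
centroids = kmeans(train["alice"] + train["bob"], k=2)
ensemble = train_ensemble(train, centroids)
test_utt = make_frames((0.5, -0.5), 20, rng)  # unseen frames from "alice"
print(identify(test_utt, ensemble, centroids))
```

The synthetic data is fabricated purely to exercise the code and says nothing about the results reported in the paper; a real system would use acoustic features and far stronger classifiers.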


Pages: 9
