Perceptual MVDR-based Unsupervised Built-in Speaker Normalization for Kazakh Speech Recognition

Cited by: 0

Authors:

Yessenbayev, Zhandos [1]

Yapanel, Umit [2]

Affiliations:

[1] Nazarbayev Univ Res & Innovat Syst, Astana, Kazakhstan

[2] Yapanel Speech Technol, Sunnyvale, CA USA

Source:

2014 IEEE 8th International Conference on Application of Information and Communication Technologies (AICT) | 2014

Keywords:

Unsupervised speaker normalization; Kazakh speech recognition; phone recognition

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In this work, we present a novel approach to unsupervised speaker normalization built on top of the Perceptual MVDR-based Built-in Speaker Normalization technique. We show that the proposed method is effective for phonetic recognition on TIMIT and then apply it to Kazakh speech recognition. The experiments show that the method yields a relative improvement of up to 20% in ASR performance. An analysis of the optimal warp factors selected by the algorithm reveals a clear separation by gender, which may be useful for gender/speaker classification tasks.


Pages: 87-91

Page count: 5

50 items in total

[1] Perceptual MVDR-based cepstral coefficients (PMCCs) for speaker recognition
LIANG Chunyan ZHANG Xiang YANG Lin ZHANG Jianping YAN Yonghong (Key Laboratory of Speech Acoustics and Content Understanding
[J]. Chinese Journal of Acoustics, 2012, 31 (04) : 489 - 498
[2] Perceptual MVDR-Based Cepstral Coefficients (PMCCs) for Speaker Recognition
Liang, Chunyan
Zhang, Xiang
Yang, Lin
Zhang, Jianping
Yan, Yonghong
[J]. 2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010: 1386 - 1389
[3] Perceptual MVDR-based cepstral coefficients (PMCCs) for robust speech recognition
Yapanel, UH
Dharanipragada, S
[J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003: 644 - 647
[4] Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization
Umit H. Yapanel
John H.L. Hansen
[J]. EURASIP Journal on Audio, Speech, and Music Processing, 2008
[5] Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization
Yapanel, Umit H.
Hansen, John H. L.
[J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2008, 2008 (1)
[6] Towards an intelligent acoustic front-end for automatic speech recognition: built-in speaker normalization (BISN)
Yapanel, UH
Hansen, JHL
[J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005: 949 - 952
[7] Speaker normalization for template based speech recognition
Demange, Sebastien
Van Compernolle, Dirk
[J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009: 560 - 563
[8] A New Subband-Weighted MVDR-Based Front-End for Robust Speech Recognition
Seyedin, Sanaz
Ahadi, Seyed Mohammad
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (08): 2252 - 2261
[9] A Study of Speech Recognition for Kazakh Based on Unsupervised Pre-Training
Meng, Weijing
Yolwas, Nurmemet
[J]. SENSORS, 2023, 23 (02)
[10] Model-based speaker normalization methods for speech recognition
Naito, M
Deng, L
Sagisaka, Y
[J]. ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 2003, 86 (02): 45 - 56
