Perceptual MVDR-based Unsupervised Built-in Speaker Normalization for Kazakh Speech Recognition

Cited by: 0
Authors
Yessenbayev, Zhandos [1]
Yapanel, Umit [2]
Affiliations
[1] Nazarbayev Univ Res & Innovat Syst, Astana, Kazakhstan
[2] Yapanel Speech Technol, Sunnyvale, CA USA
Keywords
Unsupervised speaker normalization; Kazakh speech recognition; phone recognition
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
In this work, we present a novel approach to unsupervised speaker normalization built on top of the Perceptual MVDR-based Built-in Speaker Normalization technique. We show that the proposed method can be effective for phonetic recognition on TIMIT and then apply it to Kazakh speech recognition. In our experiments, the method yields a relative improvement in ASR performance of up to 20%. An analysis of the optimal warp factors selected by the algorithm reveals a clear gender separation, which may be useful for gender/speaker classification tasks.
Pages: 87-91
Number of pages: 5