Discriminative speaker adaptation in Persian continuous speech recognition systems

Cited by: 3

Authors:

Pirhosseinloo, Shadi [1]

Ganj, Farshad Almas [1]

Affiliations:

[1] Amirkabir Univ Technol, Biomed Engn Department, Tehran, Iran

Source:

4TH INTERNATIONAL CONFERENCE OF COGNITIVE SCIENCE | 2012 / Vol. 32

Keywords:

Speech recognition; discriminative training; speaker adaptation; discriminative linear transform; minimum phone error

DOI:

10.1016/j.sbspro.2012.01.043

Chinese Library Classification (CLC) number:

B849 [Applied Psychology]

Subject classification code:

040203

Abstract:

In this paper, the use of discriminative criteria such as minimum phone error (MPE) and maximum mutual information (MMI) is investigated for the discriminative training of HMM models in a Persian speech recognition system. Discriminative training criteria have been used successfully to train acoustic models, so these criteria are also expected to improve the estimation of linear transforms for speaker adaptation. The MPE criterion is used to estimate discriminative linear transforms (DLTs) for mean transforms. Experiments on the Farsdat corpus show considerable improvements of discriminative training over maximum likelihood (ML) trained models, and MPE training outperforms MMI training on the test data. Furthermore, the MPE-based DLT reduces the word error rate in comparison with MLLR adaptation. (C) 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of the 4th International Conference of Cognitive Science.


Pages: 296-301

Page count: 6

