Leveraging Speech Production Knowledge for Improved Speech Recognition

Cited by: 1
Authors
Sangwan, Abhijeet [1]
Hansen, John H. L. [1]
Affiliation
[1] Univ Texas Dallas, Ctr Robust Speech Syst, Richardson, TX 75083 USA
Keywords
MODELS; FEATURES;
DOI
10.1109/ASRU.2009.5373368
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This study presents a novel methodology for speech recognition based on phonological features (PFs), which leverages the relationship between speech phonology and phonetics. In particular, the proposed scheme estimates the likelihood of observing speech phonology given an associative lexicon. In this manner, the scheme can choose the most likely hypothesis (word candidate) from a group of competing alternative hypotheses. The framework employs the Maximum Entropy (ME) model to learn the relationship between phonetics and phonology. The ME model is then extended to an ME-HMM (maximum entropy-hidden Markov model), which captures the speech production and linguistic relationship between phonology and words. The proposed ME-HMM is applied to the task of re-processing N-best lists, where absolute WRA (word recognition rate) increases of 1.7%, 1.9%, and 1% are reported for the TIMIT, NTIMIT, and SPINE (speech in noise) corpora, respectively (15.5% and 22.5% relative reductions in word error rate for TIMIT and NTIMIT).
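As a rough illustration of what re-processing an N-best list involves, the sketch below re-ranks competing word hypotheses by interpolating a first-pass recognizer score with a phonological-feature score. It is not the authors' ME-HMM: the `Hypothesis` fields, the `rescore` function, the interpolation `weight`, and all scores are hypothetical, chosen only to show the general re-ranking idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    words: str
    asr_score: float  # hypothetical log-score from the first-pass recognizer
    pf_score: float   # hypothetical log-likelihood of the observed phonological features


def rescore(nbest: List[Hypothesis], weight: float = 0.5) -> List[Hypothesis]:
    """Return the N-best list sorted by the interpolated log-score, best first.

    weight = 0 keeps the recognizer's original ranking criterion;
    weight = 1 ranks by the phonological-feature score alone.
    """
    def combined(h: Hypothesis) -> float:
        return (1.0 - weight) * h.asr_score + weight * h.pf_score

    return sorted(nbest, key=combined, reverse=True)


# Made-up scores for two competing hypotheses (assumption, not data from the paper).
nbest = [
    Hypothesis("wreck a nice beach", asr_score=-9.5, pf_score=-12.0),
    Hypothesis("recognize speech", asr_score=-10.0, pf_score=-8.0),
]

best = rescore(nbest, weight=0.5)[0]
print(best.words)
```

In this toy list the recognizer alone prefers the first hypothesis, while the interpolated score promotes the second. In a scheme like the one described in the abstract, the phonological score would come from a trained model and not from hand-set numbers.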
Pages: 58-63
Number of pages: 6