Discriminative Training of n-gram Language Models for Speech Recognition via Linear Programming

Cited by: 2
Authors
Magdin, Vladimir [1]
Jiang, Hui [1]
Affiliations
[1] York Univ, Dept Comp Sci & Engn, Toronto, ON M3J 1P3, Canada
DOI
10.1109/ASRU.2009.5373248
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel discriminative training algorithm for n-gram language models for use in large vocabulary continuous speech recognition. The algorithm uses Maximum Mutual Information Estimation (MMIE) to build an objective function that involves a metric computed between correct transcriptions and their competing hypotheses, which are encoded as word graphs generated from the Viterbi decoding process. The nonlinear MMIE objective function is approximated by a linear one using an EM-style auxiliary function, thus converting the discriminative training of n-gram language models into a linear programming problem, which can be efficiently solved by many convex optimization tools. Experimental results on the SPINE1 speech recognition corpus show that the proposed discriminative training method can outperform conventional discounting-based maximum likelihood estimation methods, with a relative reduction in word error rate of close to 3%.
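For orientation, the general shape of the approach described in the abstract can be sketched as follows. This is the textbook MMIE criterion followed by a generic linear program, not the paper's exact formulation. The symbols X_r, W_r, G_r, \theta, c, and \theta^{(0)} are assumptions introduced here for illustration, and back-off structure is ignored.

```latex
% Textbook MMIE criterion over R training utterances (assumed notation):
%   X_r    : acoustic observations of utterance r
%   W_r    : correct transcription of utterance r
%   G_r    : word graph of competing hypotheses for utterance r
%   \theta : n-gram probabilities, \theta_{h,w} = P(w \mid h)
\[
  F(\theta) = \sum_{r=1}^{R} \log
    \frac{p(X_r \mid W_r)\, P_\theta(W_r)}
         {\sum_{W \in G_r} p(X_r \mid W)\, P_\theta(W)}
\]

% Generic linear program obtained once F is replaced by a linear
% surrogate around the current model \theta^{(0)}. The coefficient
% vector c stands for whatever the auxiliary function yields; it is
% a placeholder, not a quantity taken from the paper.
\[
  \max_{\theta} \; \sum_{h,w} c_{h,w}\, \theta_{h,w}
  \quad \text{subject to} \quad
  \sum_{w} \theta_{h,w} = 1 \;\; \forall h,
  \qquad \theta_{h,w} \ge 0 \;\; \forall h, w
\]
```

The point of the sketch is only that both the surrogate objective and the normalization constraints are linear in the n-gram probabilities, which is what makes a linear programming solver applicable.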
Pages: 305-310
Page count: 6
Related papers
50 in total
  • [1] LARGE MARGIN ESTIMATION OF N-GRAM LANGUAGE MODELS FOR SPEECH RECOGNITION VIA LINEAR PROGRAMMING
    Magdin, Vladimir
    Jiang, Hui
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5398 - 5401
  • [2] Constrained Discriminative Training of N-gram Language Models
    Rastrow, Ariya
    Sethy, Abhinav
    Ramabhadran, Bhuvana
    [J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 311 - +
  • [3] Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition
    Tueske, Zoltan
    Schlueter, Ralf
    Ney, Hermann
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3358 - 3362
  • [4] Discriminative n-gram language modeling
    Roark, Brian
    Saraclar, Murat
    Collins, Michael
    [J]. COMPUTER SPEECH AND LANGUAGE, 2007, 21 (02): : 373 - 392
  • [5] Discriminative training of language models for speech recognition
    Kuo, KHJ
    Fosler-Lussier, E
    Jiang, H
    Lee, CH
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 325 - 328
  • [6] Improved N-gram Phonotactic Models For Language Recognition
    BenZeghiba, Mohamed Faouzi
    Gauvain, Jean-Luc
    Lamel, Lori
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2718 - 2721
  • [7] Discriminative N-gram Selection for Dialect Recognition
    Richardson, F. S.
    Campbell, W. M.
    Torres-Carrasquillo, P. A.
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 192 - 195
  • [8] Discriminative N-gram Language Modeling for Turkish
    Arisoy, Ebru
    Roark, Brian
    Shafran, Izhak
    Saraclar, Murat
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 825 - +
  • [9] N-gram language models for offline handwritten text recognition
    Zimmermann, M
    Bunke, H
    [J]. NINTH INTERNATIONAL WORKSHOP ON FRONTIERS IN HANDWRITING RECOGNITION, PROCEEDINGS, 2004, : 203 - 208
  • [10] Language modeling by string pattern N-gram for Japanese speech recognition
    Ito, A
    Kohda, M
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 490 - 493