Discriminative Training of n-gram Language Models for Speech Recognition via Linear Programming

Cited by: 2

Authors:

Magdin, Vladimir [1]

Jiang, Hui [1]

Affiliation:

[1] York Univ, Dept Comp Sci & Engn, Toronto, ON M3J 1P3, Canada

Source:

2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009) | 2009

DOI:

10.1109/ASRU.2009.5373248

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper presents a novel discriminative training algorithm for n-gram language models for use in large vocabulary continuous speech recognition. The algorithm uses Maximum Mutual Information Estimation (MMIE) to build an objective function that involves a metric computed between correct transcriptions and their competing hypotheses, which are encoded as word graphs generated by the Viterbi decoding process. The nonlinear MMIE objective function is approximated by a linear one using an EM-style auxiliary function, which converts the discriminative training of n-gram language models into a linear programming problem that can be solved efficiently by many convex optimization tools. Experimental results on the SPINE1 speech recognition corpus show that the proposed discriminative training method can outperform conventional discounting-based maximum likelihood estimation methods. A relative reduction in word error rate of close to 3% has been observed on the SPINE1 speech recognition task.
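To make the MMIE criterion mentioned in the abstract concrete, here is a minimal Python sketch of the generic MMI objective: for each utterance, the log posterior of the correct transcription among its competing hypotheses. It is an illustration under stated assumptions, not the paper's implementation. It uses a small N-best list in place of a word graph, a bigram model stored as a plain dictionary, and made-up acoustic scores; the names `mmi_objective`, `sentence_lm_logprob`, and the `floor` value for unseen bigrams are hypothetical. The EM-style linearization and the linear program themselves are not shown, since the record does not give their details.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sentence_lm_logprob(words, bigram_logprob, floor=-10.0):
    """Bigram log-probability of a sentence with <s> and </s> padding.

    Unseen bigrams receive the (assumed) log-probability `floor`.
    """
    total = 0.0
    prev = "<s>"
    for word in list(words) + ["</s>"]:
        total += bigram_logprob.get((prev, word), floor)
        prev = word
    return total


def mmi_objective(utterances, bigram_logprob):
    """Sum over utterances of the log posterior of the reference.

    Each utterance is (reference_words, hypotheses), where hypotheses is
    a list of (words, acoustic_log_score) pairs that must include the
    reference itself.
    """
    total = 0.0
    for reference, hypotheses in utterances:
        scores = {
            tuple(words): acoustic + sentence_lm_logprob(words, bigram_logprob)
            for words, acoustic in hypotheses
        }
        total += scores[tuple(reference)] - log_sum_exp(list(scores.values()))
    return total


# Toy data (invented for illustration): one utterance, two hypotheses.
utterances = [(["a", "b"], [(["a", "b"], -5.0), (["a", "c"], -5.0)])]
baseline = mmi_objective(utterances, {})
boosted = mmi_objective(utterances, {("a", "b"): -1.0})
```

With an empty bigram table both hypotheses score identically, so the reference posterior is one half. Raising the log-probability of the bigram that appears only in the reference increases the objective, which is the direction discriminative training pushes the n-gram parameters.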

Pages: 305-310

Number of pages: 6

50 records in total

[1] LARGE MARGIN ESTIMATION OF N-GRAM LANGUAGE MODELS FOR SPEECH RECOGNITION VIA LINEAR PROGRAMMING
Magdin, Vladimir
Jiang, Hui
[J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5398 - 5401
[2] Constrained Discriminative Training of N-gram Language Models
Rastrow, Ariya
Sethy, Abhinav
Ramabhadran, Bhuvana
[J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 311 - +
[3] Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition
Tueske, Zoltan
Schlueter, Ralf
Ney, Hermann
[J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3358 - 3362
[4] Discriminative n-gram language modeling
Roark, Brian
Saraclar, Murat
Collins, Michael
[J]. COMPUTER SPEECH AND LANGUAGE, 2007, 21 (02): : 373 - 392
[5] Discriminative training of language models for speech recognition
Kuo, KHJ
Fosler-Lussier, E
Jiang, H
Lee, CH
[J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 325 - 328
[6] Improved N-gram Phonotactic Models For Language Recognition
BenZeghiba, Mohamed Faouzi
Gauvain, Jean-Luc
Lamel, Lori
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2718 - 2721
[7] Discriminative N-gram Selection for Dialect Recognition
Richardson, F. S.
Campbell, W. M.
Torres-Carrasquillo, P. A.
[J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 192 - 195
[8] Discriminative N-gram Language Modeling for Turkish
Arisoy, Ebru
Roark, Brian
Shafran, Izhak
Saraclar, Murat
[J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 825 - +
[9] N-gram language models for offline handwritten text recognition
Zimmermann, M
Bunke, H
[J]. NINTH INTERNATIONAL WORKSHOP ON FRONTIERS IN HANDWRITING RECOGNITION, PROCEEDINGS, 2004, : 203 - 208
[10] Language modeling by string pattern N-gram for Japanese speech recognition
Ito, A
Kohda, M
[J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 490 - 493