System combination for improved automatic generation of N-best proper nouns pronunciation

Cited by: 0
Author
Duncan, R [1]
Affiliation
[1] Mississippi State Univ, Mississippi State, MS 39762 USA
DOI
10.1109/SECON.2001.923117
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Proper nouns present a challenging problem for current speech recognition technology because they often do not follow typical letter-to-sound conversion rules. Several automated methods (Boltzmann machines, decision trees, and recurrent neural networks) have been attempted recently, yet no single system has achieved an acceptable error rate. However, since the project goal is the generation of pronunciation dictionaries for speech recognition, we can easily combine the outputs of the multiple systems and use total database coverage as our scoring metric. For generating at least one correct pronunciation for all names, combining all systems gives a 19.6% error rate, a 23.1% absolute reduction over the best previous system. For generating every pronunciation in the database, the combined system rates at 29.1%, a 23.6% reduction.
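The combination and scoring idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that "combining" means taking the union of each system's N-best pronunciations per name, that the first metric counts a name as an error when none of its reference pronunciations is generated, and that the second counts a name as an error unless all of its reference pronunciations are generated. The function names, the phone strings, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set

# A lexicon maps each name to a set of pronunciations (phone strings).
Lexicon = Dict[str, Set[str]]


def combine_systems(outputs: List[Lexicon]) -> Lexicon:
    """Union the N-best pronunciations of every system, name by name."""
    combined: Lexicon = {}
    for system in outputs:
        for name, prons in system.items():
            combined.setdefault(name, set()).update(prons)
    return combined


def any_correct_error(hyp: Lexicon, ref: Lexicon) -> float:
    """Fraction of names with no reference pronunciation among the hypotheses."""
    misses = sum(1 for name, prons in ref.items()
                 if not (hyp.get(name, set()) & prons))
    return misses / len(ref)


def all_correct_error(hyp: Lexicon, ref: Lexicon) -> float:
    """Fraction of names with at least one reference pronunciation missing."""
    misses = sum(1 for name, prons in ref.items()
                 if not prons <= hyp.get(name, set()))
    return misses / len(ref)


# Toy reference database and two hypothetical systems (illustrative data only).
ref = {
    "smith": {"s m ih th"},
    "leigh": {"l iy", "l ey"},
    "nguyen": {"w ih n", "n uw y eh n"},
    "xu": {"sh uw"},
}
system_a = {
    "smith": {"s m ih th"},
    "leigh": {"l iy"},
    "nguyen": {"n g uw y eh n"},
    "xu": {"z uw"},
}
system_b = {
    "smith": {"s m ay th"},
    "leigh": {"l ey", "l ay"},
    "nguyen": {"w ih n"},
    "xu": {"k s uw"},
}

combined = combine_systems([system_a, system_b])
print(any_correct_error(system_a, ref))   # system A alone
print(any_correct_error(system_b, ref))   # system B alone
print(any_correct_error(combined, ref))   # combined, at least one correct
print(all_correct_error(combined, ref))   # combined, every pronunciation
```

In this toy setup each system alone misses half of the names, while the union misses only one, which mirrors why coverage-based scoring rewards combination in a dictionary-generation setting. As a check on the reported figures, a 19.6% error rate that is a 23.1% absolute reduction implies an error rate of 42.7% for the best previous system.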
Pages: 208-212
Page count: 5