Large vocabulary continuous speech recognition of an inflected language using stems and endings

被引:24
|
作者
Rotovnik, Tomaz [1 ]
Maucec, Mirjam Sepesy [1 ]
Kacic, Zdravko [1 ]
机构
[1] Univ Maribor, Fac Elect Engn & Comp Sci, Smetanova 17, SLO-2000 Maribor, Slovenia
关键词
large vocabulary continuous speech recognition; sub-word modeling; search algorithm; stem; ending;
D O I
10.1016/j.specom.2007.02.010
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
In this article, we focus on creating a large vocabulary speech recognition system for the Slovenian language. Currently, state-of-heart recognition systems are able to use vocabularies with sizes of 20,000 to 100,000 words. These systems have mostly been developed for English, which belongs to a group of uninflectional languages. Slovenian, as a Slavic language, belongs to a group of inflectional languages. Its rich morphology presents a major problem in large vocabulary speech recognition. Compared to English, the Slovenian language requires a vocabulary approximately 10 times greater for the same degree of text coverage. Consequently, the difference in vocabulary size causes a high degree of OOV (out-of-vocabulary words). Therefore OOV words have a direct impact on recognizer efficiency. The characteristics of inflectional languages have been considered when developing a new search algorithm with a method for restricting the correct order of sub-word units, and to use separate language models based on sub-words. This search algorithm combines the properties of sub-word-based models (reduced OOV) and word-based models (the length of context). The algorithm also enables better search-space limitation for sub-word models. Using sub-word models, we increase recognizer accuracy and achieve a comparable search space to that of a standard word-based recognizer. Our methods were evaluated in experiments on a SNABI speech database. (C) 2007 Elsevier B.V. All rights reserved.
引用
收藏
页码:437 / 452
页数:16
相关论文
共 50 条
  • [41] Parallel Scalability in Speech Recognition Inference engines in large vocabulary continuous speech recognition
    You, Kisun
    Chong, Jike
    Yi, Youngmin
    Gonina, Ekaterina
    Hughes, Christopher J.
    Chen, Yen-Kuang
    Sung, Wonyong
    Keutzer, Kurt
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 2009, 26 (06) : 124 - 135
  • [42] Language modelling approaches for Turkish large vocabulary continuous speech recognition based on lattice rescoring
    Arisoy, Ebru
    Saraclar, Murat
    [J]. 2006 IEEE 14th Signal Processing and Communications Applications, Vols 1 and 2, 2006, : 579 - 582
  • [43] Continuous Mandarin speech recognition for Chinese language with large vocabulary based on segmental probability model
    Shen, JL
    [J]. IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 1998, 145 (05): : 309 - 315
  • [44] A review of large-vocabulary continuous-speech recognition
    Young, S
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 1996, 13 (05) : 45 - 57
  • [45] Feature selection in mandarin large vocabulary continuous speech recognition
    Zhu, X
    Chen, YN
    Liu, J
    Liu, RS
    [J]. 2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 508 - 511
  • [46] A LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION SYSTEM WITH HIGH PREDICTABILITY
    SHIGENAGA, M
    SEKIGUCHI, Y
    YAMAGUCHI, T
    MASUDA, R
    [J]. IEICE TRANSACTIONS ON COMMUNICATIONS ELECTRONICS INFORMATION AND SYSTEMS, 1991, 74 (07): : 1817 - 1825
  • [47] A Segmental CRF Approach to Large Vocabulary Continuous Speech Recognition
    Zweig, Geoffrey
    Nguyen, Patrick
    [J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 152 - 157
  • [48] Language identification through large vocabulary continous speech recognition
    Lim, BP
    Li, HZ
    Chen, Y
    [J]. 2004 INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2004, : 49 - 52
  • [49] Large vocabulary speech recognition with multispan statistical language models
    Bellegarda, JR
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (01): : 76 - 84
  • [50] DISTRIBUTED SUBMODULAR MAXIMIZATION FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
    Qi, Jun
    Liu, Xu
    Kamijo, Shunshuke
    Tejedor, Javier
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 2501 - 2505