Syllable Based Language Model for Large Vocabulary Continuous Speech Recognition of Polish

Cited by: 0

Authors:

Majewski, Piotr [1]

Affiliations:

[1] Univ Lodz, Fac Math & Comp Sci, PL-90238 Lodz, Poland

Source:

TEXT, SPEECH AND DIALOGUE, PROCEEDINGS | 2008 / Vol. 5246

Keywords:

Polish; large vocabulary continuous speech recognition; language modeling; sub-word units; syllable-based units;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most state-of-the-art large vocabulary continuous speech recognition systems use word-based n-gram language models. Such models are not an optimal solution for inflectional or agglutinative languages. Polish is highly inflectional and requires very large corpora to build an adequate language model with a small out-of-vocabulary ratio. We propose a syllable-based language model, which is better suited to a highly inflectional language such as Polish. When resources are scarce (i.e. corpora are small), the syllable-based model outperforms word-based models in terms of the number of out-of-vocabulary units (syllables in our model). Such a model is an approximation of the morpheme-based model for Polish. In this paper, we present evaluation results for the syllable-based model and assess its usefulness in speech recognition tasks.
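The out-of-vocabulary comparison described in the abstract can be illustrated with a minimal sketch. It is not the author's method or data: the `oov_rate` and `to_syllables` helpers, the hand-made `SYLLABLES` table, and the toy word lists are all assumptions chosen only to show why unseen inflected word forms can still be covered by previously seen syllables.

```python
from typing import Dict, Iterable, List

# Hypothetical, hand-made syllable segmentation of a few Polish word forms.
# A real system would need a proper syllabification procedure.
SYLLABLES: Dict[str, List[str]] = {
    "kota": ["ko", "ta"],
    "kotem": ["ko", "tem"],
    "domu": ["do", "mu"],
    "domem": ["do", "mem"],
    "temu": ["te", "mu"],
    "tematem": ["te", "ma", "tem"],
}


def oov_rate(train_units: Iterable[str], test_units: Iterable[str]) -> float:
    """Fraction of test tokens whose unit never occurs in the training data."""
    vocab = set(train_units)
    test = list(test_units)
    if not test:
        return 0.0
    unseen = sum(1 for unit in test if unit not in vocab)
    return unseen / len(test)


def to_syllables(words: Iterable[str]) -> List[str]:
    """Flatten a word sequence into its syllables using the toy table."""
    return [syllable for word in words for syllable in SYLLABLES[word]]


# Toy "small corpus": the test set contains inflected forms absent from training.
train_words = ["kota", "temu", "tematem", "domem"]
test_words = ["kotem", "domu"]

word_oov = oov_rate(train_words, test_words)
syllable_oov = oov_rate(to_syllables(train_words), to_syllables(test_words))

print(f"word-level OOV rate:     {word_oov:.2f}")
print(f"syllable-level OOV rate: {syllable_oov:.2f}")
```

In this toy setup both test words are unseen as whole words, while every one of their syllables already occurs in the training words. The numbers say nothing about the results reported in the paper.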


Pages: 397-401

Page count: 5

50 records in total

[1] Syllable based language model for large vocabulary continuous speech recognition of Uyghur
[J]. Silamu, W. (wushour@xju.edu.cn), 1600, Tsinghua University (53):
[2] Syllable-based large vocabulary continuous speech recognition
Ganapathiraju, A
Hamaker, J
Picone, J
Ordowski, M
Doddington, GR
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (04) : 358 - 366
[3] Development of Large Vocabulary Continuous Speech Recognition for Polish
Demenko, G.
Szymanski, M.
Cecko, R.
Kusmierek, E.
Lange, M.
Wegner, K.
Klessa, K.
Owsianny, M.
[J]. ACTA PHYSICA POLONICA A, 2012, 121 (1A) : A86 - A91
[4] A unified language model for large vocabulary continuous speech recognition of Turkish
Arisoy, Ebru
Dutagaci, Helin
Arslan, Levent M.
[J]. SIGNAL PROCESSING, 2006, 86 (10) : 2844 - 2862
[5] Continuous Mandarin speech recognition for Chinese language with large vocabulary based on segmental probability model
Shen, JL
[J]. IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 1998, 145 (05) : 309 - 315
[6] Connectionist language modeling for large vocabulary continuous speech recognition
Schwenk, H
Gauvain, JL
[J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002 : 765 - 768
[7] A large vocabulary continuous speech recognition system for Persian language
Hossein Sameti
Hadi Veisi
Mohammad Bahrani
Bagher Babaali
Khosro Hosseinzadeh
[J]. EURASIP Journal on Audio, Speech, and Music Processing, 2011
[8] A large vocabulary continuous speech recognition system for Persian language
Sameti, Hossein
Veisi, Hadi
Bahrani, Mohammad
Babaali, Bagher
Hosseinzadeh, Khosro
[J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2011 : 1 - 12
[9] A usage of the syllable unit based on morphological statistics in Korean large vocabulary continuous speech recognition system
Ri, Hyok-Chol
[J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (04) : 971 - 977
[10] A usage of the syllable unit based on morphological statistics in Korean large vocabulary continuous speech recognition system
Hyok-Chol Ri
[J]. International Journal of Speech Technology, 2019, 22 : 971 - 977
