Subword analysis of small vocabulary and large vocabulary ASR for Punjabi language

Cited by: 0
Authors
Puneet Mittal
Navdeep Singh
Affiliations
[1] BBSBEC
[2] Mata Gujri College
Keywords
Subword modeling; Pronunciation dictionary; WER; Acoustic modeling
DOI
Not available
Abstract
Words must be mapped to phones carefully, because these phones, or sound units, are what the acoustic model is built from. Several choices of acoustic unit have been proposed, including phones, characters, syllables, and subwords. A problem arises when the dictionary contains too many unique subwords or phones, which makes automatic speech recognition (ASR) difficult, and researchers have formulated diverse techniques to deal with it. This paper explores a subword-based dictionary for the Punjabi language. For a large vocabulary, the number of subwords generated exceeds the number permissible for computation. To reduce the number of subwords to be modeled, an algorithm is proposed that replaces the least frequently occurring subwords with similar-sounding subwords. Acoustic models are developed using both the small- and large-vocabulary data, and their WER and size are compared. The results show that the large-vocabulary models give a high recognition rate, with a WER of only 6%.
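The abstract does not spell out the replacement algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a pronunciation dictionary stored as a word-to-subword-list mapping, a hand-made table of similar-sounding substitutes (`similar`), and an inventory limit (`max_units`). All of these names, and the romanized placeholder units in the example, are hypothetical.

```python
from collections import Counter


def merge_rare_subwords(lexicon, similar, max_units):
    """Shrink the subword inventory of a pronunciation dictionary.

    lexicon   -- dict mapping each word to its list of subword units
    similar   -- dict mapping a subword to a similar-sounding substitute
                 (hypothetical, hand-made table; not taken from the paper)
    max_units -- largest number of distinct subwords allowed (assumption)
    """
    # Work on a copy so the caller's dictionary is left untouched.
    lexicon = {word: list(units) for word, units in lexicon.items()}
    while True:
        counts = Counter(u for units in lexicon.values() for u in units)
        if len(counts) <= max_units:
            break
        # Rarest units first; only units whose substitute is a different
        # unit that is still present in the inventory can be merged.
        candidates = sorted(
            (u for u in counts if similar.get(u) in counts and similar[u] != u),
            key=lambda u: (counts[u], u),
        )
        if not candidates:
            break  # nothing left to merge; the limit cannot be reached
        rare = candidates[0]
        target = similar[rare]
        lexicon = {
            word: [target if u == rare else u for u in units]
            for word, units in lexicon.items()
        }
    return lexicon


# Toy example with placeholder units (not real Punjabi data).
toy_lexicon = {
    "w1": ["ka", "ma"],
    "w2": ["ka", "la"],
    "w3": ["kha", "ma"],
    "w4": ["ka", "maa"],
}
toy_similar = {"kha": "ka", "maa": "ma"}
reduced = merge_rare_subwords(toy_lexicon, toy_similar, max_units=3)
print(reduced)
```

Each pass removes exactly one unit from the inventory, so the loop always terminates. Ties between equally rare units are broken alphabetically here, which is an arbitrary choice made for the sketch.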
Pages: 71-78
Number of pages: 7