Combining multiple-sized sub-word units in a speech recognition system using baseform selection

Citations: 0
Authors
Nagarajan, T. [1 ]
Vijayalakshmi, P. [1 ]
O'Shaughnessy, Douglas [1 ]
Affiliation
[1] Univ Quebec, INRS EMT, Montreal, PQ H3C 3P8, Canada
Keywords
speech recognition; baseform selection; syllable;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Longer sub-word units are known to be better candidates for building a continuous speech recognition system. However, the basic problem with such units is data sparsity. To overcome this problem, researchers have tried to combine models of longer sub-word units with phoneme models. In this paper, we consider only frequently occurring syllables and VC (vowel + consonant) units, together with phone-sized units (monophones and triphones), for the development of a continuous speech recognition system. In such a case, even a single pronunciation of a word can have multiple baseforms in the lexicon, each built from different-sized units. We show that a considerable improvement in recognition performance can be achieved if the baseforms are selected properly. Of all possible baseforms for a given word in the lexicon, only the one that maximizes the acoustic likelihood over the possible concatenations of sub-word units forming the word is retained. Because only the acoustically weaker baseforms in the word lexicon of a baseline system (a pure monophone- or triphone-based system) are replaced by baseforms with longer units, the resulting performance is guaranteed to be better than that of the baseline. Preliminary experiments on the TIMIT speech corpus show a considerable improvement in recognition performance over pure monophone- and triphone-based systems when the larger units are combined through proper baseform selection.
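The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The unit inventory, the example phone sequence, and the per-unit log-likelihood values are all hypothetical, and summing fixed per-unit scores is only a stand-in for the acoustic likelihood a real system would obtain by aligning each candidate baseform against speech data. The sketch enumerates every way to cover a word's phone sequence with units from a mixed inventory (phones, a syllable, a VC unit) and keeps the baseform with the highest total score.

```python
from typing import Callable, List, Sequence, Set, Tuple

Baseform = Tuple[str, ...]


def enumerate_baseforms(phones: Sequence[str], inventory: Set[str]) -> List[Baseform]:
    """Return every way to cover `phones` with units from `inventory`.

    A multi-phone unit is named by joining its phones with '_'
    (a naming convention assumed for this sketch only).
    """
    if not phones:
        return [()]
    results: List[Baseform] = []
    for end in range(1, len(phones) + 1):
        unit = "_".join(phones[:end])
        if unit in inventory:
            for rest in enumerate_baseforms(phones[end:], inventory):
                results.append((unit,) + rest)
    return results


def select_baseform(candidates: Sequence[Baseform],
                    score: Callable[[Baseform], float]) -> Baseform:
    """Keep only the candidate with the highest (log-)likelihood score."""
    return max(candidates, key=score)


# Hypothetical per-unit log-likelihoods; a real system would estimate
# these from acoustic models and data, not from a lookup table.
unit_loglik = {
    "s": -10.0, "eh": -12.0, "v": -8.0, "ax": -11.0, "n": -9.0,  # monophones
    "s_eh": -18.0,   # frequently occurring syllable (assumed)
    "ax_n": -23.0,   # VC unit (assumed)
}


def total_loglik(baseform: Baseform) -> float:
    """Score a baseform as the sum of its unit log-likelihoods."""
    return sum(unit_loglik[u] for u in baseform)


phones = ["s", "eh", "v", "ax", "n"]
candidates = enumerate_baseforms(phones, set(unit_loglik))
best = select_baseform(candidates, total_loglik)

for c in candidates:
    print(c, total_loglik(c))
print("selected:", best)
```

In this toy setting the pure-monophone baseform is always among the candidates, so the selected baseform can never score lower than it under the chosen scoring function. That mirrors the abstract's argument that only acoustically weaker baseforms get replaced.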
Pages: 1595-1597
Page count: 3