Combined optimisation of baseforms and model parameters in speech recognition based on acoustic subword units

Cited by: 7
Authors
Holter, T [1]
Svendsen, T [1]
Affiliations
[1] Norwegian Univ Sci & Technol, Dept Telecommun, N-7034 Trondheim, Norway
DOI
10.1109/ASRU.1997.659006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A major challenge in speech recognition is creating a lexicon which is robust to inter- and intra-speaker variations. This is even more so in speech recognisers based on non-linguistic units, e.g. acoustic subword units (ASWUs), since no standard pronunciation dictionaries are available. Thus the baseforms describing the vocabulary words in terms of the recognition units need to be generated from training data. In this paper we propose an algorithm for ASWU-based speech recognition which performs a combined optimisation of the baseforms and the subword models. The resulting system has been tested on the DARPA Resource Management task, and is shown to perform comparably to a baseline phoneme-based system.
Pages: 199-206
Number of pages: 8