REGION DEPENDENT LINEAR TRANSFORMS IN MULTILINGUAL SPEECH RECOGNITION

Cited by: 0
Authors
Karafiat, Martin [1]
Janda, Milos [1]
Cernocky, Jan [1]
Burget, Lukas [1]
Affiliations
[1] Brno Univ Technol, Speech FIT, Brno, Czech Republic
Keywords
HLDA; Region Dependent Transforms; Minimum Phone Error; fMPE; multilingual speech recognition
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In today's speech recognition systems, linear or nonlinear transformations are usually applied to post-process the speech features that form the input to HMM-based acoustic models. In this work, we experiment with three popular transforms: HLDA, MPE-HLDA and Region Dependent Linear Transforms (RDLT), which are trained jointly with the acoustic model to extract the maximum discriminative information from the raw features and to represent it in a form suitable for the subsequent GMM-HMM-based acoustic model. We focus on multilingual environments, where limited resources are available for training recognizers for many languages. Using data from the GlobalPhone database, we show that, under such restrictive conditions, the feature transformations can be advantageously shared across languages and robustly trained using data from several languages.
Pages: 4885-4888
Page count: 4
Published in: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 4885-4888