CROSS-LINGUAL CONTEXT SHARING AND PARAMETER-TYING FOR MULTI-LINGUAL SPEECH RECOGNITION

Cited by: 0
Authors
Mohan, Aanchan [1]
Rose, Richard [1]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada
Keywords
Low-resource speech recognition; Subspace methods; Multi-lingual speech recognition; Semi-tied covariances; Indian languages
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of building acoustic models for automatic speech recognition (ASR) using speech data from multiple languages. Techniques for multi-lingual ASR are developed in the context of the subspace Gaussian mixture model (SGMM) [2, 3]. Multi-lingual SGMM-based ASR systems have been configured with shared subspace parameters trained from multiple languages but with distinct language-dependent phonetic contexts and states [11, 12]. First, an approach for sharing state-level SGMM parameters between the target language and a foreign language is described. Second, semi-tied covariance transformations are applied as an alternative to full-covariance Gaussians to make acoustic model training less sensitive to insufficient training data. These techniques are applied to Hindi and Marathi language data collected for an agricultural commodities dialog task in multiple Indian languages.
Pages: 126-131
Page count: 6