IMPROVING LUXEMBOURGISH SPEECH RECOGNITION WITH CROSS-LINGUAL SPEECH REPRESENTATIONS

Authors
Le Minh Nguyen [1]
Nayak, Shekhar [1]
Coler, Matt [1]
Affiliations
[1] Univ Groningen, Groningen, Netherlands
Keywords
Luxembourgish; multilingual speech recognition; language modelling; wav2vec 2.0; XLSR-53; under-resourced language
DOI
10.1109/SLT54892.2023.10022706
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Luxembourgish is a West Germanic language spoken by roughly 390,000 people, mainly in Luxembourg. It is one of Europe's under-described and under-resourced languages and has not been extensively investigated in the context of speech recognition. We explore self-supervised multilingual learning of Luxembourgish speech representations for the downstream task of speech recognition. We show that learning cross-lingual representations is essential for low-resource languages such as Luxembourgish. Learning cross-lingual representations and rescoring the output transcriptions with a language model, while using only 4 hours of labelled speech, achieves a word error rate of 15.1%, improving on our transfer-learning baseline model by 33.1% relative (7.5 percentage points absolute). Increasing the amount of labelled speech to 14 hours yields a significant performance gain, resulting in a 9.3% word error rate.
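The relative and absolute improvements quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: the baseline word error rate is not stated in this record, so `implied_baseline` is an assumption derived by adding the reported absolute gain to the reported 15.1% figure, and `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


new_wer = 15.1        # reported WER (%) with 4 hours of labelled speech
absolute_gain = 7.5   # reported absolute improvement, in percentage points

# Assumption: the baseline WER is inferred here, not quoted from the paper.
implied_baseline = new_wer + absolute_gain

rel = relative_reduction(implied_baseline, new_wer)
print(f"implied baseline WER: {implied_baseline:.1f}%")
print(f"relative reduction:   {rel:.1%}")
```

Under this assumption the implied baseline is about 22.6% and the relative reduction comes out near 33%, which agrees with the reported 33.1% up to rounding of the published figures.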
Pages: 792-797
Number of pages: 6