IMPROVING LUXEMBOURGISH SPEECH RECOGNITION WITH CROSS-LINGUAL SPEECH REPRESENTATIONS

Cited by: 0

Authors:

Le Minh Nguyen [1]

Nayak, Shekhar [1]

Coler, Matt [1]

Affiliations:

[1] Univ Groningen, Groningen, Netherlands

Source:

2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT | 2022

Keywords:

Luxembourgish; multilingual speech recognition; language modelling; wav2vec 2.0; XLSR-53; under-resourced language

DOI:

10.1109/SLT54892.2023.10022706

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Luxembourgish is a West Germanic language spoken by roughly 390,000 people, mainly in Luxembourg. It is one of Europe's under-described and under-resourced languages, not extensively investigated in the context of speech recognition. We explore the self-supervised multilingual learning of Luxembourgish speech representations for the speech recognition downstream task. We show that learning cross-lingual representations is essential for low-resourced languages such as Luxembourgish. Learning cross-lingual representations and rescoring the output transcriptions with language modelling while using only 4 hours of labelled speech achieves a word error rate of 15.1% and improves our Transfer Learning baseline model relatively by 33.1% and absolutely by 7.5%. Increasing the amount of labelled speech to 14 hours yields a significant performance gain resulting in a 9.3% word error rate.
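The relationship between the absolute and relative word error rate (WER) improvements quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the baseline WER is not stated in the abstract and is inferred here as the reported 15.1% plus the reported 7.5-point absolute gain, and the function name `wer_improvement` is hypothetical. With these rounded inputs the relative gain comes out close to, but not exactly, the reported 33.1%, which is presumably a rounding effect.

```python
def wer_improvement(baseline_wer: float, new_wer: float) -> tuple[float, float]:
    """Return (absolute, relative) WER improvement.

    Both inputs are WERs in percent. The absolute improvement is in
    percentage points; the relative improvement is in percent of the baseline.
    """
    absolute = baseline_wer - new_wer
    relative = absolute / baseline_wer * 100.0
    return absolute, relative


# Figures reported in the abstract.
new_wer = 15.1        # WER with 4 hours of labelled speech, in percent
absolute_gain = 7.5   # reported absolute improvement, in percentage points

# Assumption: the baseline WER is inferred, not quoted from the paper.
baseline_wer = new_wer + absolute_gain

absolute, relative = wer_improvement(baseline_wer, new_wer)
print(f"baseline WER (inferred): {baseline_wer:.1f}%")
print(f"absolute improvement:    {absolute:.1f} points")
print(f"relative improvement:    {relative:.1f}%")
```

The absolute figure is a difference of two error rates, while the relative figure divides that difference by the baseline, which is why the same result can be described as both 7.5% and roughly 33%.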

Pages: 792-797

Number of pages: 6
