Modeling Dialectal Variation for Swiss German Automatic Speech Recognition

Cited by: 4

Authors:

Khosravani, Abbas [1]

Garner, Philip N. [1]

Lazaridis, Alexandros [2]

Affiliations:

[1] Idiap Res Inst, Martigny, Switzerland

[2] Swisscom AG, Data Analyt & AI Grp, Bern, Switzerland

Source:

INTERSPEECH 2021 | 2021

Keywords:

Speech recognition; Wav2vec; dialectal lexicon; Swiss German; multi-dialect; Swisscom; voice assistant; TV Box;

DOI:

10.21437/Interspeech.2021-1735

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

We describe a speech recognition system for Swiss German, a dialectal spoken language in German-speaking Switzerland. Swiss German has no standard orthography, with significant variation in its written form. To alleviate the uncertainty associated with this variability, we automatically generate a lexicon from which multiple written forms of a given word in any dialect can be generated. The lexicon is built from a small (incomplete) handcrafted lexicon designed by linguistic experts and contains forms of common words in various Swiss German dialects. We exploit the powerful speech representation of self-supervised acoustic pre-training (wav2vec) to address the low-resource nature of the spoken dialects. The proposed approach results in an overall relative word error rate improvement of 9% compared to one based on an expert-generated lexicon for our TV Box voice assistant application.


Pages: 2896-2900

Number of pages: 5

