Modeling Dialectal Variation for Swiss German Automatic Speech Recognition

Cited by: 4
Authors
Khosravani, Abbas [1 ]
Garner, Philip N. [1 ]
Lazaridis, Alexandros [2 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Swisscom AG, Data Analyt & AI Grp, Bern, Switzerland
Source
Interspeech 2021
Keywords
Speech recognition; Wav2vec; dialectal lexicon; Swiss German; multi-dialect; Swisscom; voice assistant; TV Box;
DOI
10.21437/Interspeech.2021-1735
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
We describe a speech recognition system for Swiss German, a dialectal spoken language in German-speaking Switzerland. Swiss German has no standard orthography, and its written form varies significantly. To alleviate the uncertainty associated with this variability, we automatically generate a lexicon from which multiple written forms of a given word in any dialect can be generated. The lexicon is built from a small (incomplete) handcrafted lexicon designed by linguistic experts and contains forms of common words in various Swiss German dialects. We exploit the powerful speech representation of self-supervised acoustic pre-training (wav2vec) to address the low-resource nature of the spoken dialects. The proposed approach results in an overall relative improvement of 9% in word error rate compared to one based on an expert-generated lexicon for our TV Box voice assistant application.
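The abstract reports a relative (not absolute) improvement in word error rate. The following is a minimal sketch of how such a figure is usually computed. The function name and the WER values are hypothetical and are not taken from the paper, which does not give absolute numbers in this record.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only (not results from the paper):
# a baseline system at 40.0% WER and an improved system at 36.4% WER.
baseline = 0.400
improved = 0.364

gain = relative_wer_improvement(baseline, improved)
print(f"relative improvement: {gain:.1%}")
```

Under these assumed numbers, an absolute drop of 3.6 points corresponds to a 9% relative improvement, which is why the two kinds of figure should not be compared directly.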
Pages: 2896-2900
Number of pages: 5