Open Source German Distant Speech Recognition: Corpus and Acoustic Model

Cited by: 15
Authors
Radeck-Arneth, Stephan [1, 2]
Milde, Benjamin [1]
Lange, Arvid [1, 2]
Gouvea, Evandro
Radomski, Stefan [1]
Muehlhaeuser, Max [1]
Biemann, Chris [1]
Affiliations
[1] Tech Univ Darmstadt, Dept Comp Sci, Language Technol Grp, Darmstadt, Germany
[2] Tech Univ Darmstadt, Dept Comp Sci, Telecooperat Grp, Darmstadt, Germany
Source
Keywords
German speech recognition; Open source; Speech corpus; Distant speech recognition; Speaker-independent
DOI
10.1007/978-3-319-24033-6_54
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new freely available corpus for German distant speech recognition and report speaker-independent word error rate (WER) results for two open source speech recognizers trained on this corpus. The corpus has been recorded in a controlled environment with three different microphones at a distance of one meter. It comprises 180 different speakers with a total of 36 hours of audio recordings. We show recognition results with the open source toolkit Kaldi (20.5% WER) and PocketSphinx (39.6% WER) and make a complete open source solution for German distant speech recognition possible.
Pages: 480-488
Number of pages: 9