Open Source German Distant Speech Recognition: Corpus and Acoustic Model

Cited by: 15

Authors:

Radeck-Arneth, Stephan ^{[1,2]}

Milde, Benjamin ^{[1]}

Lange, Arvid ^{[1,2]}

Gouvea, Evandro

Radomski, Stefan ^{[1]}

Muehlhaeuser, Max ^{[1]}

Biemann, Chris ^{[1]}

Affiliations:

[1] Tech Univ Darmstadt, Dept Comp Sci, Language Technol Grp, Darmstadt, Germany

[2] Tech Univ Darmstadt, Dept Comp Sci, Telecooperat Grp, Darmstadt, Germany

Source:

TEXT, SPEECH, AND DIALOGUE (TSD 2015) | 2015 / Vol. 9302

Keywords:

German speech recognition; Open source; Speech corpus; Distant speech recognition; Speaker-independent;

DOI:

10.1007/978-3-319-24033-6_54

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a new freely available corpus for German distant speech recognition and report speaker-independent word error rate (WER) results for two open source speech recognizers trained on this corpus. The corpus has been recorded in a controlled environment with three different microphones at a distance of one meter. It comprises 180 different speakers with a total of 36 hours of audio recordings. We show recognition results with the open source toolkit Kaldi (20.5% WER) and PocketSphinx (39.6% WER) and make a complete open source solution for German distant speech recognition possible.
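The abstract reports its results as word error rate (WER). As a reminder of what that figure measures, the following is a minimal sketch of the standard WER definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the scoring code used by the authors, Kaldi, or PocketSphinx.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Hypothetical helper for illustration; tokenization is plain whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("das ist ein test", "das ist test")` counts one deleted word out of four reference words and returns 0.25, i.e. 25% WER. Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.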

Pages: 480-488

Number of pages: 9

50 items in total

[1] Fast-LSTM Acoustic Model for Distant Speech Recognition
Trianto, Rezki
Tai, Tzu-Chiang
Wang, Jia-Ching
2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2018
[2] A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline
Khassanov, Yerbolat
Mussakhojayeva, Saida
Mirzakhmetov, Almas
Adiyev, Alen
Nurpeiissov, Mukhamet
Varol, Huseyin Atakan
16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 697 - 706
[3] An open and free Speech Corpus for Speaker Recognition: The FSCSR Speech Corpus
Bouziane, Ayoub
Kadi, Houda
Hourri, Soufiane
Kharroubi, Jamal
2016 11TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS: THEORIES AND APPLICATIONS (SITA), 2016
[4] AISHELL-1: AN OPEN-SOURCE MANDARIN SPEECH CORPUS AND A SPEECH RECOGNITION BASELINE
Bu, Hui
Du, Jiayu
Na, Xingyu
Wu, Bengu
Zheng, Hao
2017 20TH CONFERENCE OF THE ORIENTAL CHAPTER OF THE INTERNATIONAL COORDINATING COMMITTEE ON SPEECH DATABASES AND SPEECH I/O SYSTEMS AND ASSESSMENT (O-COCOSDA), 2017, : 58 - 62
[5] LibriVoxDeEn: A Corpus for German-to-English Speech Translation and German Speech Recognition
Beilharz, Benjamin
Sun, Xin
Karimova, Sariya
Riezler, Stefan
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3590 - 3594
[6] Acoustic Model Adaptation for Emotional Speech Recognition Using Twitter-Based Emotional Speech Corpus
Kosaka, Tetsuo
Aizawa, Yoshitaka
Kato, Masaharu
Nose, Takashi
2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1747 - 1751
[7] Developing an Open-Source Corpus of Yoruba Speech
Gutkin, Alexander
Demirsahin, Isin
Kjartansson, Oddur
Rivera, Clara
Tubosun, Kola
INTERSPEECH 2020, 2020, : 404 - 408
[8] Improving Speech Recognition for the Elderly: A New Corpus of Elderly Japanese Speech and Investigation of Acoustic Modeling for Speech Recognition
Fukuda, Meiko
Nishizaki, Hiromitsu
Iribe, Yurie
Nishimura, Ryota
Kitaoka, Norihide
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6578 - 6585
[9] Open Source Speech Recognition on Edge Devices
Peinl, Rene
Rizk, Basem
Szabad, Robert
2020 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER INFORMATION TECHNOLOGIES (ACIT), 2020, : 441 - 445
[10] Speech Recognition Based on Open Source Speech Processing Software
Klosowski, Piotr
Dustor, Adam
Izydorczyk, Jacek
Kotas, Jan
Slimok, Jacek
COMPUTER NETWORKS, CN 2014, 2014, 431 : 308 - 317
