CEASR: A Corpus for Evaluating Automatic Speech Recognition

Authors
Ulasik, Malgorzata Anna [1]
Huerlimann, Manuela
Germann, Fabian
Gedik, Esin
Benites, Fernando
Cieliebak, Mark
Affiliations
[1] Zurich Univ Appl Sci, Winterthur, Switzerland
Keywords
automatic speech recognition; evaluation; speech corpus; ASR systems
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In this paper, we present CEASR, a Corpus for Evaluating the quality of Automatic Speech Recognition (ASR). It is a data set based on public speech corpora, containing metadata along with transcripts generated by several modern state-of-the-art ASR systems. CEASR provides this data in a unified structure, consistent across all corpora and systems, with normalised transcript texts and metadata. We use CEASR to evaluate the quality of ASR systems by calculating an average Word Error Rate (WER) per corpus, per system and per corpus-system pair. Our experiments show a substantial difference in accuracy between commercial and open-source ASR tools, as well as differences of up to a factor of ten for single systems on different corpora. Using CEASR allowed us to obtain these results efficiently and easily. Our corpus enables researchers to perform ASR-related evaluations and various in-depth analyses with noticeably reduced effort, i.e. without the need to collect, process and transcribe the speech data themselves.
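The per-corpus, per-system and per-pair averaging described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record fields (`corpus`, `system`, `reference`, `hypothesis`), the toy transcripts, and the choice to average per-utterance WER rather than pool errors are hypothetical and do not reflect the actual CEASR schema or the authors' exact procedure.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level Levenshtein distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1], len(ref)


def average_wer(records, key):
    """Mean per-utterance WER for each group returned by key(record)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for rec in records:
        errors, n_ref = word_errors(rec["reference"], rec["hypothesis"])
        if n_ref == 0:
            continue  # WER is undefined for an empty reference
        group = key(rec)
        sums[group] += errors / n_ref
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Hypothetical toy data: two corpora, two systems, already-normalised text.
RECORDS = [
    {"corpus": "A", "system": "s1", "reference": "the cat sat", "hypothesis": "the cat sat"},
    {"corpus": "A", "system": "s2", "reference": "the cat sat", "hypothesis": "the bat sat down"},
    {"corpus": "B", "system": "s1", "reference": "hello world", "hypothesis": "hello"},
    {"corpus": "B", "system": "s2", "reference": "hello world", "hypothesis": "yellow word"},
]

per_corpus = average_wer(RECORDS, key=lambda r: r["corpus"])
per_system = average_wer(RECORDS, key=lambda r: r["system"])
per_pair = average_wer(RECORDS, key=lambda r: (r["corpus"], r["system"]))
```

Because insertions count as errors, WER can exceed 1.0 for a single utterance. Pooling errors and reference lengths across utterances before dividing is an equally common convention and can give different averages from the per-utterance mean used in this sketch.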
Pages: 6477-6485
Number of pages: 9
Related papers
50 entries in total
  • [1] Corpus for automatic speech recognition
    Adda-Decker, Martine
    [J]. REVUE FRANCAISE DE LINGUISTIQUE APPLIQUEE, 2007, 12 (01): 71-84
  • [2] Modern standard Arabic speech corpus for implementing and evaluating automatic continuous speech recognition systems
    Abushariah, Mohammad Abd-Alrahman Mahmoud
    Ainon, Raja Noor
    Zainuddin, Roziati
    Alqudah, Assal Ali Mustafa
    Ahmed, Moustafa Elshafei
    Khalifa, Othman Omran
    [J]. JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2012, 349 (07): 2215-2242
  • [3] Creation of Marathi Speech Corpus for Automatic Speech Recognition
    Gaikwad, Santosh
    Gawali, Bharti
    Mehrotra, Suresh
    [J]. 2013 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2013 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE), 2013
  • [4] The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition
    Mukiibi, Jonathan
    Katumba, Andrew
    Nakatumba-Nabende, Joyce
    Hussein, Ali
    Meyer, Josh
    [J]. LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022: 1945-1954
  • [5] Multimodal English corpus for automatic speech recognition
    Kunka, Bartosz
    Kupryjanow, Adam
    Dalka, Piotr
    Bratoszewski, Piotr
    Szczodrak, Maciej
    Spaleniak, Pawel
    Szykulski, Marcin
    Czyzewski, Andrzej
    [J]. 2013 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2013: 106-111
  • [6] EVALUATING VAD FOR AUTOMATIC SPEECH RECOGNITION
    Tong, Sibo
    Chen, Nanxin
    Qian, Yanmin
    Yu, Kai
    [J]. 2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014: 2308-2314
  • [7] Bangladeshi Bangla speech corpus for automatic speech recognition research
    Kibria, Shafkat
    Samin, Ahnaf Mozib
    Kobir, M. Humayon
    Rahman, M. Shahidur
    Selim, M. Reza
    Iqbal, M. Zafar
    [J]. SPEECH COMMUNICATION, 2022, 136: 84-97
  • [8] RSC: A Romanian Read Speech Corpus for Automatic Speech Recognition
    Georgescu, Alexandru-Lucian
    Cucu, Horia
    Buzo, Andi
    Burileanu, Corneliu
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020: 6606-6612
  • [9] Chhattisgarhi speech corpus for research and development in automatic speech recognition
    Londhe, Narendra D.
    Kshirsagar, Ghanahshyam B.
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (02): 193-210