CEASR: A Corpus for Evaluating Automatic Speech Recognition

Authors
Ulasik, Malgorzata Anna [1]
Huerlimann, Manuela
Germann, Fabian
Gedik, Esin
Benites, Fernando
Cieliebak, Mark
Affiliations
[1] Zurich University of Applied Sciences, Winterthur, Switzerland
Keywords
automatic speech recognition; evaluation; speech corpus; ASR systems
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In this paper, we present CEASR, a Corpus for Evaluating the quality of Automatic Speech Recognition (ASR). It is a data set based on public speech corpora, containing metadata along with transcripts generated by several modern state-of-the-art ASR systems. CEASR provides this data in a unified structure, consistent across all corpora and systems, with normalised transcript texts and metadata. We use CEASR to evaluate the quality of ASR systems by calculating an average Word Error Rate (WER) per corpus, per system, and per corpus-system pair. Our experiments show a substantial difference in accuracy between commercial and open-source ASR tools, as well as differences of up to a factor of ten for a single system across different corpora. Using CEASR allowed us to obtain these results efficiently. Our corpus enables researchers to perform ASR-related evaluations and in-depth analyses with noticeably reduced effort, i.e. without having to collect, process, and transcribe the speech data themselves.
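The evaluation described in the abstract can be illustrated with a minimal Python sketch. It computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, then averages it per corpus, per system, and per corpus-system pair. The record fields (`corpus`, `system`, `reference`, `hypothesis`), the sample data, and the choice to average per-utterance WER values are assumptions made for illustration; the abstract does not specify CEASR's schema or its exact averaging method.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def wer(reference, hypothesis):
    """Word Error Rate of one hypothesis against one reference."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference transcript is empty")
    return errors / n_ref


def average_wer(records, key):
    """Mean per-utterance WER, grouped by key(record)."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(wer(rec["reference"], rec["hypothesis"]))
    return {k: sum(v) / len(v) for k, v in groups.items()}


# Hypothetical records; corpus and system names are placeholders.
records = [
    {"corpus": "A", "system": "s1",
     "reference": "the cat sat", "hypothesis": "the cat sat"},
    {"corpus": "A", "system": "s2",
     "reference": "the cat sat", "hypothesis": "the cat"},
    {"corpus": "B", "system": "s1",
     "reference": "hello world", "hypothesis": "hello word"},
    {"corpus": "B", "system": "s2",
     "reference": "hello world", "hypothesis": "yellow word"},
]

per_corpus = average_wer(records, key=lambda r: r["corpus"])
per_system = average_wer(records, key=lambda r: r["system"])
per_pair = average_wer(records, key=lambda r: (r["corpus"], r["system"]))

print(per_corpus)
print(per_system)
print(per_pair)
```

Averaging per-utterance WER weights short and long utterances equally. Pooling all errors and dividing by the total number of reference words is a common alternative and can give noticeably different figures.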
Pages: 6477-6485
Number of pages: 9