Using prompts to produce quality corpus for training automatic speech recognition systems

Authors
Lecouteux, Benjamin [1]
Linares, Georges [1]
Affiliations
[1] Univ Avignon, LIA, Avignon, France
Keywords
speech recognition; closed captioning; corpus building; automatic segmentation
DOI
Not available
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
In this paper we present an integrated unsupervised method for producing a quality corpus for training automatic speech recognition (ASR) systems using prompts or closed captions. Closed captions and prompts do not always have timestamps and do not necessarily correspond to the exact speech. We propose a method for extracting a quality corpus from imperfect transcripts. The proposed approach works in two steps. During the search, the ASR system finds matching segments in a large prompt database. Matching segments are then used inside a Driven Decoding Algorithm (DDA) to produce a high-quality corpus. Results show an F-measure of 96% in terms of spotting, while the DDA corrects the output according to the prompts: a high-quality corpus is easily extracted.
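The spotting step and its F-measure evaluation can be illustrated with a minimal text-level sketch. This is not the authors' method: the abstract gives no implementation details, and the Driven Decoding Algorithm operates inside the ASR decoder rather than on finished word strings. The function names `spot_segment` and `f_measure`, the `min_words` threshold, and the use of the longest shared word run as the matching criterion are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def spot_segment(hypothesis, prompt, min_words=3):
    """Return (start, end) word indices in `prompt` of the longest word run
    shared with an ASR hypothesis, or None if it is shorter than min_words.

    Hypothetical helper: a plain-text stand-in for segment spotting.
    """
    hyp_words = hypothesis.lower().split()
    prompt_words = prompt.lower().split()
    matcher = SequenceMatcher(None, hyp_words, prompt_words, autojunk=False)
    match = matcher.find_longest_match(0, len(hyp_words), 0, len(prompt_words))
    if match.size < min_words:
        return None
    return match.b, match.b + match.size


def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall over spotted segments."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)
```

For example, `spot_segment("uh the weather today is sunny and warm", "good evening the weather today is sunny with clouds")` returns `(2, 7)`, the span of "the weather today is sunny" in the prompt. A real system would match against a large prompt database and tolerate recognition errors inside the segment, which this sketch does not attempt.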
Pages: 820-825
Number of pages: 6