Using prompts to produce quality corpus for training automatic speech recognition systems

Cited by: 0
Authors
Lecouteux, Benjamin [1]
Linares, Georges [1]
Affiliations
[1] Univ Avignon, LIA, Avignon, France
Keywords
speech recognition; closed captioning; corpus building; automatic segmentation
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
In this paper we present an integrated unsupervised method for producing a quality corpus for training automatic speech recognition (ASR) systems using prompts or closed captions. Closed captions and prompts do not always have timestamps and do not necessarily correspond to the exact speech. We propose a method for extracting a quality corpus from imperfect transcripts. The proposed approach works in two steps. First, during the search, the ASR system finds matching segments in a large prompt database. The matching segments are then used inside a Driven Decoding Algorithm (DDA) to produce a high-quality corpus. Results show an F-measure of 96% in terms of spotting, while the DDA corrects the output according to the prompts, so a high-quality corpus is easily extracted.
Pages: 820-825
Number of pages: 6