Lattice-based lightly-supervised acoustic model training

Cited by: 3
Authors
Fainberg, Joachim [1 ]
Klejch, Ondrej [1 ]
Renals, Steve [1 ]
Bell, Peter [1 ]
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Source
INTERSPEECH 2019
Keywords
Automatic speech recognition; lightly supervised training; LF-MMI; broadcast media; transcription; selection
DOI
10.21437/Interspeech.2019-2533
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline codes
100104; 100213
Abstract
In the broadcast domain there is an abundance of related text data and partial transcriptions, such as closed captions and subtitles. This text data can be used for lightly supervised training, in which text matching the audio is selected using an existing speech recognition model. Current approaches to light supervision typically filter the data based on matching error rates between the transcriptions and biased decoding hypotheses. In contrast, semi-supervised training does not require matching text data, instead generating a hypothesis using a background language model. State-of-the-art semi-supervised training uses lattice-based supervision with the lattice-free MMI (LF-MMI) objective function. We propose a technique to combine inaccurate transcriptions with the lattices generated for semi-supervised training, thus preserving uncertainty in the lattice where appropriate. We demonstrate that this combined approach reduces the expected error rates over the lattices, and reduces the word error rate (WER) on a broadcast task.
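To make the baseline concrete: light supervision as described in the abstract keeps a training segment only when its closed caption agrees closely with a biased decoding hypothesis. The following minimal Python sketch illustrates that selection step; the function names, data layout, and the 0.2 threshold are illustrative assumptions, not details taken from the paper.

# Hypothetical sketch of caption/hypothesis matching for lightly supervised
# data selection. Segment layout, names and the 0.2 threshold are assumptions.

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)]


def matching_error_rate(caption, hypothesis):
    """WER-style mismatch between a caption and a biased decoding hypothesis."""
    ref, hyp = caption.split(), hypothesis.split()
    return edit_distance(ref, hyp) / max(len(ref), 1)


def select_segments(segments, threshold=0.2):
    """Keep (utterance_id, caption) pairs whose caption matches the biased
    decoding hypothesis closely enough to be trusted as supervision."""
    return [(utt_id, caption)
            for utt_id, caption, hypothesis in segments
            if matching_error_rate(caption, hypothesis) <= threshold]


if __name__ == "__main__":
    segments = [
        ("utt1", "the cat sat on the mat", "the cat sat on the mat"),
        ("utt2", "we now return to the show", "thanks for watching goodbye"),
    ]
    print(select_segments(segments))  # keeps utt1, discards utt2

The combination proposed in the paper avoids this hard accept/reject decision: rather than discarding a mismatched segment, the inaccurate transcription is combined with the lattice generated for semi-supervised training, so the LF-MMI supervision retains uncertainty where the two sources disagree.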
Pages: 1596-1600
Number of pages: 5
Related papers (50 in total; first 10 listed)
  • [1] Fraga-Silva, Thiago; Gauvain, Jean-Luc; Lamel, Lori. Lattice-based unsupervised acoustic model training. 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2011: 4656-4659.
  • [2] Lamel, L.; Gauvain, J.-L.; Adda, G. Lightly supervised and unsupervised acoustic model training. Computer Speech and Language, 2002, 16(1): 115-129.
  • [3] Lamel, L.; Gauvain, J.-L.; Adda, G. Investigating lightly supervised acoustic model training. 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001: 477-480.
  • [4] Paulik, Matthias; Waibel, Alex. Lightly supervised acoustic model training on EPPS recordings. INTERSPEECH 2008: 9th Annual Conference of the International Speech Communication Association, 2008: 224+.
  • [5] Rooshenas, Amirmohammad; Zhang, Dongxu; Sharma, Gopal; McCallum, Andrew. Search-guided, lightly-supervised training of structured prediction energy networks. Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019.
  • [6] Chen, L. Z.; Lamel, L.; Gauvain, J.-L. Lightly supervised acoustic model training using consensus networks. 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004: 189-192.
  • [7] Colbaugh, Rich; Glass, Kristin. Predicting antimicrobial resistance via lightly-supervised learning. 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), 2019: 2428-2433.
  • [8] Huang, Jianbin; Sun, Heli. Lightly-supervised clustering using pairwise constraint propagation. 2008 3rd International Conference on Intelligent System and Knowledge Engineering, 2008: 765+.
  • [9] Mihajlik, Peter; Balog, Andras. Lightly supervised acoustic model training for imprecisely and asynchronously transcribed speech. 2013 7th Conference on Speech Technology and Human-Computer Dialogue (SpeD), 2013.
  • [10] Li, Sheng; Akita, Yuya; Kawahara, Tatsuya. Automatic lecture transcription based on discriminative data selection for lightly supervised acoustic model training. IEICE Transactions on Information and Systems, 2015, E98-D(8): 1545-1552.