Semi-supervised and unsupervised discriminative language model training for automatic speech recognition

Cited by: 6
Authors
Dikici, Erinc [1]
Saraclar, Murat [1]
Affiliation
[1] Bogazici Univ, Dept Elect & Elect Engn, TR-34342 Istanbul, Turkey
Keywords
Discriminative language modeling; Semi-supervised training; Unsupervised training; CLASSIFICATION; RERANKING; RANKING
DOI
10.1016/j.specom.2016.07.004
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Discriminative language modeling aims to reduce error rates by rescoring the output of an automatic speech recognition (ASR) system. Discriminative language model (DLM) training conventionally follows a supervised approach, using acoustic recordings together with their manual transcriptions (references) as training data, and recognition performance improves as the amount of such matched data increases. In this study we investigate the case where matched data for DLM training is limited or not available at all, and explore methods to improve ASR accuracy by incorporating acoustic and text data that come from separate sources. For semi-supervised training, we utilize a confusion model to generate artificial hypotheses instead of the real ASR N-bests. For unsupervised training, we propose three target output selection methods to stand in for the missing reference. We handle this task as both a structured prediction and a reranking problem and employ two different variants of the WER-sensitive perceptron algorithm. We show that a significant improvement over baseline ASR accuracy is obtained even when there is no transcribed acoustic data available to train the DLM. (C) 2016 Elsevier B.V. All rights reserved.
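To make the rescoring idea in the abstract concrete, the following is a minimal Python sketch of a perceptron reranker over N-best lists whose update is scaled by the word-error gap between the current top hypothesis and the lowest-error (oracle) hypothesis. It is an illustration under stated assumptions, not the authors' implementation: the unigram-count feature map, the function names (`edit_distance`, `features`, `score`, `train`, `rerank`), and the exact update rule are all hypothetical choices, and the two perceptron variants used in the paper may differ from this one.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def features(hyp):
    """Assumed feature map: unigram counts of the hypothesis."""
    return Counter(hyp)


def score(weights, hyp):
    """Linear model score of one hypothesis."""
    return sum(weights.get(f, 0.0) * v for f, v in features(hyp).items())


def train(nbest_lists, targets, epochs=5):
    """Perceptron training with updates scaled by the word-error gap."""
    weights = {}
    for _ in range(epochs):
        for nbest, target in zip(nbest_lists, targets):
            errors = [edit_distance(target, h) for h in nbest]
            oracle = min(range(len(nbest)), key=lambda k: errors[k])
            best = max(range(len(nbest)),
                       key=lambda k: score(weights, nbest[k]))
            gap = errors[best] - errors[oracle]
            if gap > 0:  # current winner has more word errors than the oracle
                for f, v in features(nbest[oracle]).items():
                    weights[f] = weights.get(f, 0.0) + gap * v
                for f, v in features(nbest[best]).items():
                    weights[f] = weights.get(f, 0.0) - gap * v
    return weights


def rerank(weights, nbest):
    """Return the highest-scoring hypothesis of an N-best list."""
    return max(nbest, key=lambda h: score(weights, h))


# Toy usage: one utterance with a two-hypothesis N-best list.
nbest = [["i", "scream"], ["ice", "cream"]]
weights = train([nbest], [["ice", "cream"]])
print(rerank(weights, nbest))
```

In this sketch `targets` holds manual transcriptions. In the unsupervised setting described in the abstract, a selected target output would take that place, and in the semi-supervised setting the N-best lists would be artificial hypotheses produced by a confusion model; neither of those components is shown here.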
Pages: 54-63
Page count: 10
Related papers
50 items in total
  • [1] Unsupervised and semi-supervised adaptation of a hybrid speech recognition system
    Trmal, Jan
    Zelinka, Jan
    Mueller, Ludek
    [J]. PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 527 - 530
  • [2] Semi-supervised Model for Emotion Recognition in Speech
    Pereira, Ingryd
    Santos, Diego
    Maciel, Alexandre
    Barros, Pablo
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT I, 2018, 11139 : 791 - 800
  • [3] Improved low-resource Somali speech recognition by semi-supervised acoustic and language model training
    Biswas, Astik
    Menon, Raghav
    van der Westhuizen, Ewald
    Niesler, Thomas
    [J]. INTERSPEECH 2019, 2019, : 3008 - 3012
  • [4] Semi-Supervised Training of Language Model on Spanish Conversational Telephone Speech Data
    Egorova, Ekaterina
    Luque Serrano, Jordi
    [J]. SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES, 2016, 81 : 114 - 120
  • [5] Semi-Supervised Speech Recognition Acoustic Model Training Using Policy Gradient
    Chung, Hoon
    Lee, Sung Joo
    Jeon, Hyeong Bae
    Park, Jeon Gue
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (10):
  • [6] Lightly supervised vs. semi-supervised training of acoustic model on Luxembourgish for low-resource automatic speech recognition
    Vesely, Karel
    Segura, Carlos
    Szoke, Igor
    Luque, Jordi
    Cernocky, Jan Honza
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2883 - 2887
  • [7] Performance Comparison of Training Algorithms for Semi-Supervised Discriminative Language Modeling
    Dikici, Erinc
    Celebi, Arda
    Saraclar, Murat
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 206 - 209
  • [8] SEMI-SUPERVISED LEARNING OF LANGUAGE MODEL USING UNSUPERVISED TOPIC MODEL
    Bai, Shuanhu
    Huang, Chien-Lin
    Ma, Bin
    Li, Haizhou
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5382 - 5385
  • [9] A Discriminative Model for Semi-Supervised Learning
    Balcan, Maria-Florina
    Blum, Avrim
    [J]. JOURNAL OF THE ACM, 2010, 57 (03)
  • [10] A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition
    Yi, Cheng
    Wang, Jianzong
    Cheng, Ning
    Zhou, Shiyu
    Xu, Bo
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,