Online Incremental Learning for Speaker-Adaptive Language Models

Cited by: 0
Authors
Hu, Chih Chi [1]
Liu, Bing [1]
Shen, John Paul [1]
Lane, Ian [1]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
Keywords
Automatic Speech Recognition; Online Learning; Language Modeling; Speaker-Adaptation; Speaker Specific Modeling; Recurrent Neural Networks; ADAPTATION;
DOI
10.21437/Interspeech.2018-2259
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Voice control is a prominent interaction method on personal computing devices. While automatic speech recognition (ASR) systems are readily applicable to large audiences, there is room for further adaptation at the edge, i.e., locally on devices, targeted at individual users. In this work, we explore improving ASR systems over time through a user's own interactions. Our online learning approach for speaker-adaptive language modeling leverages a user's most recent utterances to enhance the speaker-dependent features and traits. We experiment with the Large Vocabulary Continuous Speech Recognition corpus Tedlium v2, and demonstrate an average reduction in perplexity (PPL) of 19.18% and an average relative reduction in word error rate (WER) of 2.80% compared to a state-of-the-art baseline on Tedlium v2.
Pages: 3363-3367
Number of pages: 5
Related papers
50 in total
  • [1] A compact model for speaker-adaptive training
    Anastasakos, T
    McDonough, J
    Schwartz, R
    Makhoul, J
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1137 - 1140
  • [2] Comparing Speaker-Dependent and Speaker-Adaptive Acoustic Models for Recognizing Dysarthric Speech
    Rudzicz, Frank
    ASSETS'07: PROCEEDINGS OF THE NINTH INTERNATIONAL ACM SIGACCESS CONFERENCE ON COMPUTERS AND ACCESSIBILITY, 2007, : 255 - 256
  • [3] Integrated speaker-adaptive speech synthesis
    Wan, Moquan
    Degottex, Gilles
    Gales, Mark J. F.
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 705 - 711
  • [4] On Speaker-Independent, Speaker-Dependent, and Speaker-Adaptive Speech Recognition
    Huang, Xuedong
    Lee, Kai-Fu
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, 1 (02): : 150 - 157
  • [5] Comparison of Gender- and Speaker-adaptive Emotion Recognition
    Sidorov, Maxim
    Ultes, Stefan
    Schmitt, Alexander
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 3476 - 3480
  • [6] Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation
    Yu, Dong
    Deng, Li
    Acero, Alex
    COMPUTER SPEECH AND LANGUAGE, 2007, 21 (01): : 72 - 87
  • [7] A Robust Speaker-Adaptive and Text-Prompted Speaker Verification System
    Hong, Qingyang
    Wang, Sheng
    Liu, Zhijian
    BIOMETRIC RECOGNITION (CCBR 2014), 2014, 8833 : 385 - 393
  • [8] A robust speaker-adaptive and text-prompted speaker verification system
    Hong, Qingyang, 1600, Springer Verlag (8833):
  • [9] Dysarthric Speech Recognition Using Dysarthria-Severity-Dependent and Speaker-Adaptive Models
    Kim, Myung Jong
    Yoo, Joohong
    Kim, Hoirin
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3589 - 3593
  • [10] Speaker-Adaptive Multimodal Prediction Model for Listener Responses
    de Kok, Iwan
    Heylen, Dirk
    Morency, Louis-Philippe
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013, : 51 - 58