Online Incremental Learning for Speaker-Adaptive Language Models

Cited by: 0
Authors
Hu, Chih Chi [1]
Liu, Bing [1]
Shen, John Paul [1]
Lane, Ian [1]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
Keywords
Automatic Speech Recognition; Online Learning; Language Modeling; Speaker-Adaptation; Speaker Specific Modeling; Recurrent Neural Networks; ADAPTATION;
DOI
10.21437/Interspeech.2018-2259
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Voice control is a prominent interaction method on personal computing devices. While automatic speech recognition (ASR) systems are readily applicable to large audiences, there is room for further adaptation at the edge, i.e., locally on devices, targeted at individual users. In this work, we explore improving ASR systems over time through a user's own interactions. Our online learning approach for speaker-adaptive language modeling leverages a user's most recent utterances to enhance the speaker-dependent features and traits. We experiment with the Large Vocabulary Continuous Speech Recognition corpus Tedlium v2, and demonstrate an average reduction in perplexity (PPL) of 19.18% and an average relative reduction in word error rate (WER) of 2.80% compared to a state-of-the-art baseline.
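As a reading aid for the metrics quoted in the abstract, the following is a minimal sketch of how perplexity and a relative reduction are conventionally computed. It is not taken from the paper: the function names `perplexity` and `relative_reduction` and all numeric inputs are hypothetical, chosen only for illustration.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum(log p_i)) over the N tokens of the evaluated text.
    """
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(log_probs) / len(log_probs))


def relative_reduction(baseline, adapted):
    """Relative reduction in percent: 100 * (baseline - adapted) / baseline."""
    return 100.0 * (baseline - adapted) / baseline


# Hypothetical token probabilities, not results from the paper.
baseline_ppl = perplexity([math.log(0.1)] * 4)    # every token assigned p = 0.1
adapted_ppl = perplexity([math.log(0.125)] * 4)   # every token assigned p = 0.125

print(f"baseline PPL: {baseline_ppl:.2f}")
print(f"adapted PPL:  {adapted_ppl:.2f}")
print(f"relative reduction: {relative_reduction(baseline_ppl, adapted_ppl):.2f}%")
```

The same `relative_reduction` helper applies to word error rate: a relative reduction compares the drop against the baseline value rather than reporting the absolute difference in percentage points.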
Pages: 3363-3367
Page count: 5