Unsupervised Lattice-based Acoustic Model Adaptation for Speaker-Dependent Conversational Telephone Speech Transcription

Authors
Thambiratnam, K. [1]
Seide, F. [1]
Affiliations
[1] Microsoft Res Asia, 5F Sigma Ctr, Beijing 100080, Peoples R China
Keywords
Unsupervised Acoustic Model Adaptation; Conversational Speech Recognition
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper examines the application of lattice adaptation techniques to speaker-dependent models for conversational telephone speech transcription. Given sufficient training data per speaker, it is feasible to build adapted speaker-dependent models using lattice MLLR and lattice MAP. Experiments on iterative and cascaded adaptation are presented. Additionally, various strategies for thresholding frame posteriors are investigated, and it is shown that accumulating statistics from the local best-confidence path is sufficient to achieve optimal adaptation. Overall, an iterative cascaded lattice system reduced WER by 7.0% abs., a 0.8% abs. gain over transcript-based adaptation. Lattice adaptation reduced the unsupervised/supervised adaptation gap from 2.5% to 1.7%.
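As a rough illustration of what thresholding frame posteriors means when accumulating adaptation statistics, the sketch below applies the textbook MAP update to a single Gaussian mean. It is not the authors' implementation: the function name `map_adapt_mean`, the prior weight `tau`, and the `threshold` parameter are assumptions introduced here, and a real lattice system would obtain the per-frame posteriors from a forward-backward pass over the recognition lattice.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0, threshold=0.0):
    """Textbook MAP update of one Gaussian mean from soft frame posteriors.

    prior_mean : list of floats, mean of the unadapted Gaussian
    frames     : list of feature vectors (one list of floats per frame)
    posteriors : per-frame occupation probability of this Gaussian
    tau        : hypothetical prior weight (larger keeps the mean near the prior)
    threshold  : hypothetical cutoff; frames with a lower posterior are ignored
    """
    # Keep only frames whose posterior reaches the threshold.
    kept = [(x, g) for x, g in zip(frames, posteriors) if g >= threshold]

    # Zeroth-order statistic: total soft occupancy of the kept frames.
    occupancy = sum(g for _, g in kept)

    # First-order statistic: posterior-weighted sum of the kept frames.
    dim = len(prior_mean)
    first_order = [sum(g * x[d] for x, g in kept) for d in range(dim)]

    # MAP interpolation between the prior mean and the data mean.
    return [
        (tau * prior_mean[d] + first_order[d]) / (tau + occupancy)
        for d in range(dim)
    ]


frames = [[1.0, 1.0], [3.0, 3.0]]
posteriors = [1.0, 0.2]
soft = map_adapt_mean([0.0, 0.0], frames, posteriors, tau=1.0)
hard = map_adapt_mean([0.0, 0.0], frames, posteriors, tau=1.0, threshold=0.5)
```

With `threshold=0.0` every frame contributes in proportion to its posterior. Raising the threshold discards low-confidence frames, which is one simple way to trade the amount of adaptation data against its reliability.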
Pages: 1567-1570
Number of pages: 4