Unsupervised Lattice-based Acoustic Model Adaptation for Speaker-Dependent Conversational Telephone Speech Transcription

Cited by: 0

Authors:

Thambiratnam, K. [1]

Seide, F. [1]

Affiliations:

[1] Microsoft Research Asia, 5F Sigma Center, Beijing 100080, People's Republic of China

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Keywords:

Unsupervised Acoustic Model Adaptation; Conversational Speech Recognition

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper examines the application of lattice adaptation techniques to speaker-dependent models for conversational telephone speech transcription. Given sufficient training data per speaker, it is feasible to build adapted speaker-dependent models using lattice MLLR and lattice MAP. Experiments on iterative and cascaded adaptation are presented. Additionally, various strategies for thresholding frame posteriors are investigated, and it is shown that accumulating statistics from the local best-confidence path is sufficient to achieve optimal adaptation. Overall, an iterative cascaded lattice system reduced WER by 7.0% absolute, a 0.8% absolute gain over transcript-based adaptation. Lattice adaptation reduced the gap between unsupervised and supervised adaptation from 2.5% to 1.7%.
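
To make the idea of thresholding frame posteriors concrete, the following is a minimal sketch of a MAP-style update for a single Gaussian mean, where each frame contributes in proportion to its posterior and frames below a cutoff are discarded. It is an illustration under stated assumptions, not the authors' implementation: the function name `map_adapt_mean`, the prior weight `tau`, and the cutoff `min_posterior` are all hypothetical, the posteriors are assumed to have been computed from a lattice beforehand, and a plain cutoff is used instead of the local best-confidence path strategy that the abstract mentions.

```python
from typing import List, Sequence


def map_adapt_mean(
    prior_mean: Sequence[float],
    frames: Sequence[Sequence[float]],
    posteriors: Sequence[float],
    tau: float = 10.0,
    min_posterior: float = 0.0,
) -> List[float]:
    """MAP-style update of one Gaussian mean from posterior-weighted frames.

    prior_mean    : mean of the unadapted model (hypothetical input)
    frames        : feature vectors assigned to this Gaussian
    posteriors    : per-frame occupancy, assumed to come from a lattice
    tau           : weight of the prior (assumed value, not from the paper)
    min_posterior : frames with a posterior below this cutoff are skipped
    """
    dim = len(prior_mean)
    occupancy = 0.0
    weighted_sum = [0.0] * dim

    for x, gamma in zip(frames, posteriors):
        if gamma < min_posterior:
            continue  # thresholding: ignore low-confidence frames
        occupancy += gamma
        for d in range(dim):
            weighted_sum[d] += gamma * x[d]

    # Interpolate between the prior mean and the weighted data mean.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy data: the third frame has a low posterior and is dropped by the cutoff.
adapted = map_adapt_mean(
    prior_mean=[0.0, 0.0],
    frames=[[2.0, 2.0], [4.0, 4.0], [10.0, 10.0]],
    posteriors=[1.0, 1.0, 0.2],
    tau=2.0,
    min_posterior=0.5,
)
print(adapted)
```

With the cutoff at 0.5 only the first two frames are accumulated, so the low-confidence outlier does not pull the adapted mean toward it. Setting `min_posterior=0.0` would include every frame, weighted by its posterior.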

Pages: 1567-1570

Number of pages: 4
