On-line incremental speaker adaptation for broadcast news transcription

Cited by: 7

Authors:

Zhang, ZP

Furui, S

Ohtsuki, K

Affiliations:

[1] Tokyo Inst Technol, Dept Comp Sci, Meguro Ku, Tokyo 1528552, Japan

[2] NTT Corp, Cyber Space Labs, Media Proc Project, Yokosuka, Kanagawa 2390847, Japan

Source:

SPEECH COMMUNICATION | 2002 / Vol. 37 / Issue 3-4

Keywords:

speaker adaptation; speaker-change detection; likelihood comparison; GMM (Gaussian mixture models); SA (speaker-adaptive) GMM;

DOI:

10.1016/S0167-6393(01)00018-8

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

This paper describes a new unsupervised, on-line and incremental speaker adaptation technique that improves the performance of speech recognition systems when there are frequent changes in speaker identity and each speaker utters a series of several sentences. Speaker changes are detected using speaker-independent (SI) and speaker-adaptive (SA) Gaussian mixture models (GMMs), and both the phone hidden Markov models (HMMs) and the GMMs are adapted by maximum likelihood linear regression (MLLR) transformation. Using this method, the word error rate of a broadcast news transcription task was reduced by 10.0% relative to the results using the SI models. (C) 2002 Elsevier Science B.V. All rights reserved.
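The likelihood comparison named in the keywords can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decision rule, so the rule below (declare a speaker change when the SA GMM no longer scores the utterance better than the SI GMM) is an assumption, as are the function names `gmm_avg_loglik` and `speaker_changed`, the `margin` parameter, and the toy model values. The MLLR adaptation step is not shown.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means, variances: (K, D).
    """
    frames = np.asarray(frames, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalisation term
    )                                                        # (T, K)
    log_joint = log_comp + np.log(weights)
    # Log-sum-exp over mixture components, done stably.
    peak = log_joint.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_joint - peak).sum(axis=1))
    return float(log_frame.mean())

def speaker_changed(frames, si_gmm, sa_gmm, margin=0.0):
    """Assumed rule: a change is declared when the SA GMM does not
    beat the SI GMM by more than `margin` (hypothetical threshold)."""
    gain = gmm_avg_loglik(frames, *sa_gmm) - gmm_avg_loglik(frames, *si_gmm)
    return bool(gain <= margin)

# Toy models (made-up values): a broad SI model and an SA model
# that has been sharpened around the current speaker.
si_gmm = (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[4.0, 4.0]]))
sa_gmm = (np.array([1.0]), np.array([[2.0, 2.0]]), np.array([[0.25, 0.25]]))

same_speaker = np.full((5, 2), 2.0)    # frames close to the SA model
new_speaker = np.full((5, 2), -2.0)    # frames far from the SA model
```

In this toy setup, `speaker_changed(same_speaker, si_gmm, sa_gmm)` evaluates to `False` and `speaker_changed(new_speaker, si_gmm, sa_gmm)` to `True`. In an on-line system, a detected change would presumably reset the SA models to the SI models before adaptation resumes, but that behaviour is also an assumption here.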

Pages: 271-281

Number of pages: 11
