Long term on-line speaker adaptation for large vocabulary dictation

Cited by: 0

Authors:

Thelen, E

Affiliation:

Source:

ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4 | 1996

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

On-line speaker adaptation is desirable for speech recognition dictation applications because it makes it possible to improve the system with speaker-specific data obtained from the user. Since the user will work with such a device over a long period, long-term adaptation performance matters more for a dictation system than adaptation speed. In contrast to speaker-dependent re-training, on-line speaker adaptation does not require the speaker-specific speech data to be stored, and each adaptation step requires little computational effort. In this paper we describe our way of performing on-line Bayesian speaker adaptation using partial traceback. We compare supervised with unsupervised adaptation, and speaker adaptation with speaker-dependent training using the adaptation material. Compared to the speaker-independent startup models, the error rate was halved after five hours of supervised adaptation in our experiments. In the long-term experiments, supervised on-line adaptation performed similarly to speaker-dependent training using the adaptation material.
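The abstract's point that on-line adaptation needs neither stored speech data nor heavy computation per step can be illustrated with a minimal sketch of incremental Bayesian (MAP) adaptation of a single Gaussian mean. This is not the paper's implementation: the class name `OnlineMapMean`, the prior weight `tau`, and the per-frame `weights` are assumptions made for illustration, and the partial-traceback alignment that would supply frames and weights in a real recognizer is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class OnlineMapMean:
    """Hypothetical incremental MAP estimate of one Gaussian mean vector."""
    prior_mean: List[float]          # speaker-independent startup mean
    tau: float = 10.0                # assumed prior weight (pseudo-count)
    count: float = 0.0               # accumulated frame weight
    weighted_sum: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.weighted_sum:
            self.weighted_sum = [0.0] * len(self.prior_mean)

    def update(self, frames: Sequence[Sequence[float]],
               weights: Optional[Sequence[float]] = None) -> None:
        """Fold new frames into the sufficient statistics, then discard them."""
        if weights is None:
            weights = [1.0] * len(frames)
        for frame, w in zip(frames, weights):
            self.count += w
            for d, value in enumerate(frame):
                self.weighted_sum[d] += w * value

    def mean(self) -> List[float]:
        """MAP mean: prior and data interpolated by tau and the data count."""
        denom = self.tau + self.count
        return [(self.tau * m + s) / denom
                for m, s in zip(self.prior_mean, self.weighted_sum)]
```

Only `count` and `weighted_sum` persist between steps, so the cost of each update is proportional to the new frames alone. As `count` grows relative to `tau`, the estimate moves from the speaker-independent prior toward the speaker's own data mean.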


Pages: 2139-2142

Number of pages: 4

Related literature (50 items in total):

[1] Rapid speaker adaptation for embedded large vocabulary dictation system with sparse training materials
Huang, Wei
Zhang, Yaxin
He, Xin
Bao, Qingfeng
[J]. 2008 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING, VOLS 1 AND 2, PROCEEDINGS, 2008, : 1069 - 1072
[2] A study on speaker adaptation of large vocabulary
Jeon, B
Kim, J
Hong, S
Kwon, Y
Lee, K
[J]. ISIE 2001: IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS PROCEEDINGS, VOLS I-III, 2001, : 513 - 515
[3] On-line incremental speaker adaptation with automatic speaker change detection
Zhang, ZP
Furui, S
Ohtsuki, K
[J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 961 - 964
[4] Iterative unsupervised speaker adaptation for batch dictation
Homma, S
Takahashi, J
Sagayama, S
[J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1141 - 1144
[5] CCLMDS'96: Towards a speaker-independent large-vocabulary Mandarin dictation system
Chiang, TH
Pengwu, CM
Chien, SC
Chang, CH
[J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1799 - 1802
[6] SPEAKER ADAPTATION IN A LARGE-VOCABULARY GAUSSIAN HMM RECOGNIZER
KENNY, P
LENNIG, M
MERMELSTEIN, P
[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1990, 12 (09) : 917 - 920
[7] Experiments in speaker normalisation and adaptation for large vocabulary speech recognition
Pye, D
Woodland, PC
[J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1047 - 1050
[8] On-line incremental speaker adaptation for broadcast news transcription
Zhang, ZP
Furui, S
Ohtsuki, K
[J]. SPEECH COMMUNICATION, 2002, 37 (3-4) : 271 - 281
[9] Speaker clustering and transformation for speaker adaptation in large-vocabulary speech recognition systems
Padmanabhan, M
Bahl, LR
Nahamoo, D
Picheny, MA
[J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 701 - 704
[10] Speaker adaptation in the philips system for large vocabulary continuous speech recognition
Thelen, E
Aubert, X
Beyerlein, P
[J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1035 - 1038
