UNSUPERVISED SPEAKER ADAPTATION FOR TELEPHONE CALL TRANSCRIPTION

Cited by: 4

Authors:

Wallace, R. [1]

Thambiratnam, K. [2]

Seide, F. [2]

Affiliations:

[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld, Australia

[2] Microsoft Res Asia, Beijing 100080, Peoples R China

Source:

2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009

Keywords:

Speaker adaptation; acoustic model adaptation; language model adaptation; unsupervised adaptation; speech recognition

DOI:

10.1109/ICASSP.2009.4960603

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of untranscribed speech for a particular speaker. This paper investigates how best to exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model (LM) adaptation are presented. Using one hour of automatically transcribed speech per speaker with a word error rate of 36.0%, unsupervised adaptation resulted in an absolute gain of 6.3%, equivalent to 70% of the gain from the supervised case, with additional adaptation data likely to yield further improvements. LM adaptation experiments suggested that although there seems to be a small degree of speaker idiolect, adaptation to the speaker alone, without considering the topic of the conversation, is in itself unlikely to improve transcription accuracy.
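The abstract's figures rest on word error rate (WER) and on a ratio of absolute gains. The sketch below is only an illustration of that arithmetic, not the authors' scoring tool: `word_error_rate` and `implied_supervised_gain` are hypothetical helper names, WER is assumed to be the standard word-level edit distance divided by the reference length, and the supervised gain is inferred from the stated 6.3% and 70% rather than quoted from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def implied_supervised_gain(unsupervised_gain: float, fraction: float) -> float:
    """Supervised gain implied when the unsupervised gain is `fraction` of it."""
    return unsupervised_gain / fraction


if __name__ == "__main__":
    # Toy transcript pair (made up for illustration, not data from the paper).
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # Assumption: 6.3% absolute is exactly 70% of the supervised gain.
    print(implied_supervised_gain(6.3, 0.70))
```

Under that reading, a 6.3% absolute gain at 70% of the supervised case implies a supervised gain of roughly 9.0% absolute. The abstract does not state the baseline recognition WER on the test data, so the adapted WER itself cannot be derived from these numbers.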

Pages: 4393 / +

Page count: 2