UNSUPERVISED SPEAKER ADAPTATION FOR TELEPHONE CALL TRANSCRIPTION

Cited by: 4
Authors
Wallace, R. [1]
Thambiratnam, K. [2]
Seide, F. [2]
Affiliations
[1] Queensland University of Technology, Speech and Audio Research Laboratory, Brisbane, Qld, Australia
[2] Microsoft Research Asia, Beijing 100080, People's Republic of China
Keywords
Speaker adaptation; acoustic model adaptation; language model adaptation; unsupervised adaptation; speech recognition
DOI
10.1109/ICASSP.2009.4960603
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model adaptation are presented. Using one hour of automatically transcribed speech per speaker with a word error rate of 36.0%, unsupervised adaptation resulted in an absolute gain of 6.3%, equivalent to 70% of the gain from the supervised case, with additional adaptation data likely to yield further improvements. LM adaptation experiments suggested that although there seems to be a small degree of speaker idiolect, adaptation to the speaker alone, without considering the topic of the conversation, is in itself unlikely to improve transcription accuracy.
Pages: 4393 / +
Page count: 2