Speaker adaptation techniques for speech recognition using probabilistic models

Cited by: 3
Authors
Shinoda, K [1]
Affiliation
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138656, Japan
Keywords
speech recognition; speaker adaptation; hidden Markov model
DOI
10.1002/ecjc.20207
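Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract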
In speech recognition, speaker adaptation refers to the range of techniques whereby a speech recognition system is adapted to the acoustic features of a specific user using a small sample of utterances from that user. In recent years, the practical development of speaker-independent speech recognition systems using continuous density hidden Markov models has seen significant progress; however, the recognition performance of these systems has not yet reached that of speaker-dependent speech recognition systems, in which a user's speech is registered beforehand. Much hope has therefore been placed on the establishment of speaker adaptation techniques that can bring the performance of a speaker-independent system up to that of a speaker-dependent one using the smallest possible amount of data. In this paper we present a survey of previous research into speaker adaptation techniques, focusing particularly on three important approaches in this area: maximum a posteriori (MAP) parameter estimation, maximum likelihood linear regression (MLLR), and eigenvoices. We also discuss approaches that combine these techniques in a lateral fashion. (C) 2005 Wiley Periodicals, Inc.
Pages: 25-42
Number of pages: 18