N-Best-based unsupervised speaker adaptation for speech recognition

Cited by: 15
Authors
Matsui, T
Furui, S
Affiliations
[1] Nippon Telegraph & Tel Corp, Human Interface Labs, Yokosuka, Kanagawa 239, Japan
[2] Tokyo Inst Technol, Meguro Ku, Tokyo 152, Japan
Source
COMPUTER SPEECH AND LANGUAGE | 1998 / Vol. 12 / Issue 01
DOI
10.1006/csla.1997.0036
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an instantaneous speaker adaptation method that uses N-best decoding for continuous mixture-density hidden-Markov-model-based speech-recognition systems. The method is effective even for speakers for whom decoding with speaker-independent (SI) models is error-prone, that is, the speakers who most need speaker adaptation. In addition, smoothed estimation and utterance verification are introduced into the method. The smoothed estimation is based on the likelihood values for adapted models of the word sequences obtained by N-best decoding, and it improves performance for error-prone speakers; the utterance verification technique reduces the amount of calculation required. Performance evaluation using connected-digit (four-digit string) recognition experiments performed over actual telephone lines showed a 36.4% reduction in the error rate for speakers for whom decoding with SI models is error-prone. © 1998 Academic Press Limited.
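The abstract does not give the formula for the smoothed estimation, so the following is only an illustrative sketch of one way likelihood values from N-best hypotheses could be used to smooth adapted parameters. It assumes that each N-best word sequence yields its own adapted parameter vector and log-likelihood, and that the smoothed estimate is their likelihood-weighted average. The function names `posterior_weights` and `smoothed_estimate`, and the weighting scheme itself, are hypothetical and are not taken from the paper.

```python
import math


def posterior_weights(log_likelihoods):
    """Turn per-hypothesis log-likelihoods into normalized weights.

    Subtracting the maximum before exponentiating avoids underflow.
    (Assumed weighting scheme, not the paper's stated formula.)
    """
    peak = max(log_likelihoods)
    scaled = [math.exp(ll - peak) for ll in log_likelihoods]
    total = sum(scaled)
    return [s / total for s in scaled]


def smoothed_estimate(candidate_params, log_likelihoods):
    """Likelihood-weighted average of per-hypothesis adapted parameters.

    candidate_params: one parameter vector per N-best hypothesis.
    log_likelihoods:  log-likelihood of the utterance under each
                      hypothesis's adapted model.
    """
    weights = posterior_weights(log_likelihoods)
    dim = len(candidate_params[0])
    return [
        sum(w * params[d] for w, params in zip(weights, candidate_params))
        for d in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical N-best hypotheses with equal likelihood.
    params = [[0.0, 2.0], [2.0, 4.0]]
    print(smoothed_estimate(params, [-10.0, -10.0]))
```

Under this assumed scheme, a hypothesis whose adapted model scores far below the best one contributes almost nothing, so a single misrecognized top hypothesis has less influence than it would if only the 1-best result were used for adaptation.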
Pages: 41-50
Number of pages: 10