MDL-based context-dependent subword modeling for speech recognition

Cited by: 0
Authors
Shinoda, Koichi [1]
Watanabe, Takao [1]
Affiliations
[1] NEC Corp, Kawasaki, Japan
Keywords
Markov processes; Mathematical models; Maximum likelihood estimation; Pattern recognition systems; Speech analysis
DOI
Not available
Abstract
Context-dependent phone units, such as triphones, have recently come into use as subword units in speech recognition systems based on hidden Markov models (HMMs). Most such systems cluster the HMM parameters (e.g., subword clustering and state clustering) to control the HMM size and thereby avoid the poor recognition accuracy caused by a lack of training data, but none of them provides an effective criterion for determining the optimal number of clusters. This paper proposes a method in which state clustering is performed with phonetic decision trees and the minimum description length (MDL) criterion is used to optimize the number of clusters. Large-vocabulary Japanese-language recognition experiments show that this method achieves higher accuracy than the maximum-likelihood approach.
Pages: 79-86
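
The abstract describes using the MDL criterion to decide how many clusters a phonetic decision tree should produce. The sketch below illustrates one common way such a rule can be applied to a single candidate split. It is an illustration under stated assumptions, not the authors' formulation, which this record does not give: it assumes the generic two-part MDL form (negative log-likelihood plus half the parameter count times the log of the total data count), one diagonal-covariance Gaussian per tree node, and hypothetical names throughout (`Stats`, `stats_from_frames`, `log_likelihood`, `mdl_delta`, `should_split`).

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Stats:
    """Sufficient statistics of the frames assigned to one tree node."""
    count: float
    s1: List[float]  # per-dimension sum of x
    s2: List[float]  # per-dimension sum of x squared

    def __add__(self, other: "Stats") -> "Stats":
        return Stats(
            self.count + other.count,
            [a + b for a, b in zip(self.s1, other.s1)],
            [a + b for a, b in zip(self.s2, other.s2)],
        )


def stats_from_frames(frames: Sequence[Sequence[float]]) -> Stats:
    """Accumulate statistics from a list of feature vectors."""
    dim = len(frames[0])
    return Stats(
        float(len(frames)),
        [sum(f[d] for f in frames) for d in range(dim)],
        [sum(f[d] ** 2 for f in frames) for d in range(dim)],
    )


def log_likelihood(st: Stats, var_floor: float = 1e-6) -> float:
    """Log-likelihood of the node's data under its own ML diagonal Gaussian."""
    dim = len(st.s1)
    log_det = 0.0
    for a, b in zip(st.s1, st.s2):
        mean = a / st.count
        var = max(b / st.count - mean * mean, var_floor)
        log_det += math.log(var)
    return -0.5 * st.count * (dim * (1.0 + math.log(2.0 * math.pi)) + log_det)


def mdl_delta(yes: Stats, no: Stats, total_count: float) -> float:
    """Change in description length if the parent (yes + no) is split.

    A split adds one Gaussian, i.e. 2 * dim parameters (mean and variance),
    so the penalty is 0.5 * (2 * dim) * log(total_count).
    """
    parent = yes + no
    gain = log_likelihood(yes) + log_likelihood(no) - log_likelihood(parent)
    penalty = len(parent.s1) * math.log(total_count)
    return penalty - gain


def should_split(yes: Stats, no: Stats, total_count: float) -> bool:
    """Accept the split only if it shortens the description length."""
    return mdl_delta(yes, no, total_count) < 0.0


# Toy 1-D data: two well-separated groups versus two identical groups.
low = stats_from_frames([[-1.0], [0.0], [1.0], [0.5], [-0.5]])
high = stats_from_frames([[9.0], [10.0], [11.0], [10.5], [9.5]])
print(should_split(low, high, total_count=10.0))
print(should_split(low, low, total_count=10.0))
```

In a sketch like this, tree growth stops once no candidate question yields a negative `mdl_delta`, so the number of clusters follows from the data and the penalty term instead of a hand-tuned likelihood threshold.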