MDL-based context-dependent subword modeling for speech recognition

Cited by: 0

Authors:

Shinoda, Koichi [1]

Watanabe, Takao [1]

Affiliation:

[1] NEC Corp, Kawasaki, Japan

Source:

Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi) | 2000, Vol. 21, No. 2

Keywords:

Markov processes; Mathematical models; Maximum likelihood estimation; Pattern recognition systems; Speech analysis

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Context-dependent phone units, such as triphones, have recently come to be used to model subword units in speech recognition systems that are based on the use of hidden Markov models (HMMs). While most such systems employ clustering of the HMM parameters (e.g., subword clustering and state clustering) to control the HMM size, so as to avoid poor recognition accuracy due to a lack of training data, none of them provide any effective criteria for determining the optimal number of clusters. This paper proposes a method in which state clustering is accomplished by way of phonetic decision trees and in which the minimum description length (MDL) criterion is used to optimize the number of clusters. Large-vocabulary Japanese-language recognition experiments show that this method achieves higher accuracy than the maximum-likelihood approach.
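The stopping rule described in the abstract can be illustrated with a minimal sketch. It assumes a generic two-part MDL score (negative log-likelihood plus half the parameter count times the log of the total frame count) and single diagonal-covariance Gaussians per tree node; the exact description-length formula used in the paper is not given in this record, so the penalty term, the variance floor, and the function names `gaussian_loglik`, `mdl_delta`, and `should_split` are all illustrative assumptions.

```python
import math


def gaussian_loglik(frames):
    """Log-likelihood of frames under their own ML diagonal Gaussian."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        mean = sum(f[d] for f in frames) / n
        var = sum((f[d] - mean) ** 2 for f in frames) / n
        var = max(var, 1e-6)  # assumed variance floor to avoid log(0)
        total += math.log(2.0 * math.pi * var) + 1.0
    return -0.5 * n * total


def mdl_delta(parent, yes, no, total_frames):
    """Change in description length if `parent` is split into `yes`/`no`.

    Assumption: a split adds one Gaussian, i.e. 2 * dim parameters
    (mean and variance per dimension), each costing 0.5 * log(total_frames).
    """
    dim = len(parent[0])
    gain = gaussian_loglik(yes) + gaussian_loglik(no) - gaussian_loglik(parent)
    penalty = dim * math.log(total_frames)
    return penalty - gain


def should_split(parent, yes, no, total_frames):
    """Accept a phonetic question only if it shortens the description."""
    return mdl_delta(parent, yes, no, total_frames) < 0.0


# Two well-separated groups: the likelihood gain outweighs the penalty.
far_yes = [[0.0], [0.1], [-0.1], [0.05]]
far_no = [[10.0], [10.1], [9.9], [10.05]]
print(should_split(far_yes + far_no, far_yes, far_no, total_frames=8))

# Two identical groups: no likelihood gain, so the split is rejected.
same = [[0.0], [1.0], [2.0], [3.0]]
print(should_split(same + same, same, same, total_frames=8))
```

In a full tree-growing loop, this test would be applied to the best candidate question at each leaf, and growth would stop once no remaining question gives a negative delta, so the number of clusters is determined without a hand-tuned likelihood threshold.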

Pages: 79-86

50 items in total

[1] A frame-based context-dependent acoustic modeling for speech recognition
Terashima R.
Zen H.
Nankaku Y.
Tokuda K.
IEEJ Transactions on Electronics, Information and Systems, 2010, 130 (10) : 1856 - 1864+24
[2] Regression-Based Context-Dependent Modeling of Deep Neural Networks for Speech Recognition
Wang, Guangsen
Sim, Khe Chai
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (11) : 1660 - 1669
[3] Context-dependent acoustic modeling using graphemes for large vocabulary speech recognition
Kanthak, S
Ney, H
2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 845 - 848
[4] Hybrid methodological approach to context-dependent speech recognition
Miskovic, Dragisa
Gnjatovic, Milan
Strbac, Perica
Trenkic, Branimir
Jakovljevic, Niksa
Delic, Vlado
INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2017, 14 (01):
[5] Context-dependent acoustic models for Chinese speech recognition
Ma, B
Huang, TY
Xu, B
Zhang, XJ
Qu, F
1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 455 - 458
[6] Context-dependent quantization for distributed and/or robust speech recognition
Wan, Chia-Yu
Chen, Yi
Lee, Lin-Shan
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4413 - 4416
[7] Subword Modeling for Automatic Speech Recognition
Livescu, Karen
Fosler-Lussier, Eric
Metze, Florian
IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 44 - 57
[8] WithYou: Automated Adaptive Speech Tutoring With Context-Dependent Speech Recognition
Zhang, Xinlei
Miyaki, Takashi
Rekimoto, Jun
PROCEEDINGS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'20), 2020,
[9] Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese
Wu, JJX
Deng, L
Chan, J
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2281 - 2284
[10] Integration of context-dependent durational knowledge into HMM-based speech recognition
Wang, X
tenBosch, LFM
Pols, LCW
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1073 - 1076
