An Innovative Prosody Modeling Method for Chinese Speech Recognition

Cited by: 0

Authors:

Gang Peng

William S.-Y. Wang

Affiliation:

[1] City University of Hong Kong, Language Engineering Laboratory, Department of Electronic Engineering

Source:

International Journal of Speech Technology | 2004, Vol. 7, Issue 2-3

Keywords:

Chinese dialects; speech recognition; prosody modeling; context-dependent

DOI:

10.1023/B:IJST.0000017013.70486.51

Abstract:

This paper presents an innovative method for prosody modeling in Chinese speech recognition. The method first evaluates the reliability of the prosodic information; the recognition system then uses this reliability to dynamically tune the balance between the spectral scores and the prosodic scores. The basic idea is to use prosodic knowledge in proportion to its reliability: the higher the reliability, the more the prosodic information contributes to recognition. Thus, the method incorporates more knowledge into the recognition system without introducing extra errors. Experimental results showed relative word error rate reductions of as much as 52.9% and 46.0% for Mandarin and Cantonese digit string recognition tasks, respectively. When tone information was incorporated into Cantonese Large Vocabulary Continuous Speech Recognition (LVCSR) via the proposed method, a 20.16% relative character error rate reduction was obtained.
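
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of reliability-weighted score fusion, not the authors' implementation. It assumes log-domain scores, a reliability estimate already normalized to [0, 1], and a simple linear weighting; the names `Hypothesis`, `combined_score`, `rescore`, and `max_weight` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    label: str
    spectral_score: float  # log-score from the spectral (acoustic) model
    prosodic_score: float  # log-score from the prosody (e.g. tone) model


def combined_score(hyp, reliability, max_weight=1.0):
    """Add the prosodic score, scaled by its estimated reliability.

    Assumption: reliability lies in [0, 1]; values outside are clamped.
    With reliability 0 the prosodic score is ignored entirely.
    """
    r = min(max(reliability, 0.0), 1.0)
    return hyp.spectral_score + max_weight * r * hyp.prosodic_score


def rescore(hyps, reliability, max_weight=1.0):
    """Return the hypothesis with the highest combined score."""
    return max(hyps, key=lambda h: combined_score(h, reliability, max_weight))


# Two made-up candidates that differ only in tone.
hyps = [
    Hypothesis("ma1", spectral_score=-10.0, prosodic_score=-5.0),
    Hypothesis("ma3", spectral_score=-10.5, prosodic_score=-1.0),
]
best_unreliable = rescore(hyps, reliability=0.0)  # spectral score decides
best_reliable = rescore(hyps, reliability=1.0)    # prosody can change the winner
```

In this toy example, unreliable prosody leaves the spectral ranking untouched, while reliable prosody is allowed to overturn it, which mirrors the abstract's claim that the prosodic contribution grows with its reliability.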

Pages: 129-140

Number of pages: 11
