An Innovative Prosody Modeling Method for Chinese Speech Recognition

Authors:

Gang Peng

William S.-Y. Wang

Affiliation:

[1] City University of Hong Kong, Language Engineering Laboratory, Department of Electronic Engineering

Source:

International Journal of Speech Technology | 2004 / Vol. 7 / No. 2-3

Keywords:

Chinese dialects; speech recognition; prosody modeling; context-dependent

DOI:

10.1023/B:IJST.0000017013.70486.51

Abstract:

This paper presents an innovative method for prosody modeling in Chinese speech recognition. The method first evaluates the reliability of the prosodic information; the recognition system then uses this reliability to dynamically tune the balance between the spectral scores and the prosodic scores. The basic idea is to use prosodic knowledge according to its reliability: the higher the reliability, the more the prosodic information contributes to recognition. Thus, this method does not introduce extra errors but incorporates more knowledge into the recognition system. Experimental results showed relative word error rate reductions of as much as 52.9% and 46.0% for Mandarin and Cantonese digit string recognition tasks, respectively. When tone information was incorporated into Cantonese Large Vocabulary Continuous Speech Recognition (LVCSR) via the proposed method, a 20.16% relative character error rate reduction was obtained.
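
The reliability-based balancing described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulation, so everything here is an assumption made for illustration: the linear interpolation, the reliability value in [0, 1], and the names `Hypothesis`, `fuse_scores`, `pick_best`, and `max_weight` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Hypothesis:
    """A recognition candidate with two log-domain scores (hypothetical type)."""
    label: str
    spectral_score: float  # log-score from the spectral (acoustic) model
    prosodic_score: float  # log-score from the prosodic (e.g. tone) model


def fuse_scores(hyp: Hypothesis, reliability: float, max_weight: float = 1.0) -> float:
    """Combine the two scores, scaling the prosodic term by its reliability.

    Assumption: reliability lies in [0, 1]. At 0 the prosodic score is
    ignored, so the result equals the spectral-only score.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    weight = max_weight * reliability
    return hyp.spectral_score + weight * hyp.prosodic_score


def pick_best(hyps: Sequence[Hypothesis], reliability: float) -> str:
    """Return the label of the hypothesis with the highest fused score."""
    return max(hyps, key=lambda h: fuse_scores(h, reliability)).label


# Illustrative values only; they are not taken from the paper.
candidates = [
    Hypothesis("A", spectral_score=-10.0, prosodic_score=-8.0),
    Hypothesis("B", spectral_score=-11.0, prosodic_score=-2.0),
]
spectral_only_choice = pick_best(candidates, reliability=0.0)
reliable_prosody_choice = pick_best(candidates, reliability=1.0)
```

In this sketch, unreliable prosodic evidence leaves the spectral ranking unchanged, while reliable prosodic evidence can overturn it. That mirrors the abstract's stated idea; the paper's actual weighting scheme may differ.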

Pages: 129-140

Page count: 11
