Prosody modeling for automatic speech recognition and understanding

Cited by: 0

Authors:

Shriberg, E [1]

Stolcke, A [1]

Affiliations:

[1] SRI Int, Menlo Pk, CA 94025 USA

Source:

MATHEMATICAL FOUNDATIONS OF SPEECH AND LANGUAGE PROCESSING | 2004, Vol. 138

Keywords:

prosody; speech recognition and understanding; hidden Markov models

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

This paper summarizes statistical modeling approaches for the use of prosody (the rhythm and melody of speech) in automatic recognition and understanding of speech. We outline effective prosodic feature extraction, model architectures, and techniques to combine prosodic with lexical (word-based) information. We then survey a number of applications of the framework, and give results for automatic sentence segmentation and disfluency detection, topic segmentation, dialog act labeling, and word recognition.

Pages: 105-114

Page count: 10
