Prosody modeling for automatic speech recognition and understanding

Cited by: 0
Authors
Shriberg, E [1]
Stolcke, A [1]
Affiliations
[1] SRI Int, Menlo Pk, CA 94025 USA
Keywords
prosody; speech recognition and understanding; hidden Markov models
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper summarizes statistical modeling approaches for the use of prosody (the rhythm and melody of speech) in automatic recognition and understanding of speech. We outline effective prosodic feature extraction, model architectures, and techniques to combine prosodic with lexical (word-based) information. We then survey a number of applications of the framework, and give results for automatic sentence segmentation and disfluency detection, topic segmentation, dialog act labeling, and word recognition.
Pages: 105 - 114
Number of pages: 10
Related papers
50 in total
  • [31] Trick the System: Towards understanding Automatic Speech Recognition Systems
    Markert, Karla
    ERCIM NEWS, 2021, (125): 48 - 49
  • [32] Understanding Racial Disparities in Automatic Speech Recognition: the case of habitual "be"
    Martin, Joshua L.
    Tang, Kevin
    INTERSPEECH 2020, 2020: 626 - 630
  • [33] PROSODY MODELING FOR MANDARIN EXCLAMATORY SPEECH
    Jia, Huibin
    Tao, Jianhua
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009: 890 - 893
  • [34] Fluent speech prosody: Framework and modeling
    Tseng, CY
    Pin, SH
    Lee, Y
    Wang, HM
    Chen, YC
    SPEECH COMMUNICATION, 2005, 46 (3-4) : 284 - 309
  • [35] Fluent speech prosody: Framework and modeling
    Tseng, Chiu-Yu
    Pin, Shao-Huang
    Lee, Yehlin
    Wang, Hsin-Min
    Chen, Yong-Cheng
    Speech Communication, 2005, 46 (3-4): 284 - 309
  • [36] Unsupervised Adaptation of Categorical Prosody Models for Prosody Labeling and Speech Recognition
    Ananthakrishnan, Sankaranarayanan
    Narayanan, Shrikanth
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (01): 138 - 149
  • [37] Cerebral mechanisms for understanding emotional prosody in speech
    Pell, MD
    BRAIN AND LANGUAGE, 2006, 96 (02) : 221 - 234
  • [38] Important prosody characteristics for spontaneous speech recognition
    Kleckova, J
    Krutisova, J
    Matousek, V
    Schwarz, J
    ICONIP'02: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING: COMPUTATIONAL INTELLIGENCE FOR THE E-AGE, 2002: 717 - 721
  • [39] Automatic Speech Recognition for Uyghur through Multilingual Acoustic Modeling
    Abulimiti, Ayimunishagu
    Schultz, Tanja
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020: 6444 - 6449
  • [40] Improving Language Modeling with an Adversarial Critic for Automatic Speech Recognition
    Zhang, Yike
    Zhang, Pengyuan
    Yan, Yonghong
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 3348 - 3352