Prosody modeling for automatic speech recognition and understanding

Cited by: 0
Authors
Shriberg, E [1]
Stolcke, A [1]
Affiliations
[1] SRI Int, Menlo Pk, CA 94025 USA
Keywords
prosody; speech recognition and understanding; hidden Markov models
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper summarizes statistical modeling approaches for the use of prosody (the rhythm and melody of speech) in automatic recognition and understanding of speech. We outline effective prosodic feature extraction, model architectures, and techniques to combine prosodic with lexical (word-based) information. We then survey a number of applications of the framework, and give results for automatic sentence segmentation and disfluency detection, topic segmentation, dialog act labeling, and word recognition.
Pages: 105-114
Page count: 10