Prosodic knowledge sources for automatic speech recognition

Authors
Vergyri, D [1]
Stolcke, A [1]
Gadde, VRR [1]
Ferrer, L [1]
Shriberg, E [1]
Affiliation
[1] SRI Int, Speech Technol & Res Lab, Menlo Pk, CA 94025 USA
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This work integrates several prosodic knowledge sources into a state-of-the-art large-vocabulary speech recognition system. Prosody manifests itself at different levels in the speech signal: within words, as changes in phone durations and pitch; between words, as variation in pause length; and beyond words, where it correlates with higher-level linguistic structures and nonlexical phenomena. We investigate three models, each exploiting a different level of prosodic information, to rescore N-best hypotheses according to how well the recognized words correspond to the prosodic features of the utterance. Experiments on the Switchboard corpus show word accuracy improvements with each prosodic knowledge source. Combining all models yields a further improvement, demonstrating that each captures somewhat different prosodic characteristics of the speech signal.
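The N-best rescoring idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the knowledge-source scores are combined, so the weighted log-linear sum used here is an assumption, not the authors' method. The score names (acoustic, lm, duration, pause), the weights, and all numbers are hypothetical and chosen only to make the example easy to follow.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    """One entry of an N-best list: a word sequence plus per-source log scores."""
    words: List[str]
    scores: Dict[str, float]  # hypothetical knowledge-source name -> log-domain score


def combined_score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    """Weighted sum of log scores (assumed log-linear combination)."""
    return sum(weight * hyp.scores[name] for name, weight in weights.items())


def rescore_nbest(nbest: List[Hypothesis], weights: Dict[str, float]) -> Hypothesis:
    """Return the hypothesis with the highest combined score."""
    return max(nbest, key=lambda hyp: combined_score(hyp, weights))


# Made-up two-entry N-best list; the scores are illustrative, not measured.
NBEST = [
    Hypothesis(
        words=["i", "see", "the", "sea"],
        scores={"acoustic": -100.0, "lm": -22.0, "duration": -8.0, "pause": -5.0},
    ),
    Hypothesis(
        words=["icy", "the", "sea"],
        scores={"acoustic": -99.0, "lm": -22.0, "duration": -12.0, "pause": -6.0},
    ),
]

# Baseline uses only acoustic and language-model scores.
BASELINE_WEIGHTS = {"acoustic": 1.0, "lm": 1.0}

# Prosodic rescoring adds the (hypothetical) duration and pause model scores.
PROSODIC_WEIGHTS = {"acoustic": 1.0, "lm": 1.0, "duration": 0.5, "pause": 0.5}

if __name__ == "__main__":
    print("baseline :", " ".join(rescore_nbest(NBEST, BASELINE_WEIGHTS).words))
    print("prosodic :", " ".join(rescore_nbest(NBEST, PROSODIC_WEIGHTS).words))
```

In this toy list the baseline weights prefer the second hypothesis, while adding the two prosodic scores moves the first hypothesis to the top. In practice the weights would be tuned on held-out data rather than set by hand.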
Pages: 208-211
Page count: 4