Integrated recognition of words and prosodic phrase boundaries

Cited by: 20
Authors
Gallwitz, F [1]
Niemann, H [1]
Nöth, E [1]
Warnke, V [1]
Affiliations
[1] Univ Erlangen Nurnberg, Chair Pattern Recognit, D-91058 Erlangen, Germany
Keywords
speech recognition; prosody; speech understanding
DOI
10.1016/S0167-6393(01)00027-9
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we present an integrated approach for recognizing both the word sequence and the syntactic-prosodic structure of a spontaneous utterance. The approach aims at improving the performance of the understanding component of speech understanding systems by exploiting not only acoustic-phonetic and syntactic information, but also prosodic information directly within the speech recognition process. Whereas spoken utterances are typically modelled as unstructured word sequences in the speech recognizer, our approach includes phrase boundary information in the language model and provides HMMs to model the acoustic and prosodic characteristics of phrase boundaries. This methodology has two major advantages compared to purely word-based speech recognizers. First, the speech recognizer additionally determines syntactic-prosodic boundaries, which facilitates parsing and helps resolve syntactic and semantic ambiguities. Second, once the boundary information has been removed from the recognizer output, the integrated model yields a 4% relative word error rate (WER) reduction compared to a traditional word recognizer. The boundary classification performance is equal to that of a separate prosodic classifier operating on the word recognizer output, thus making a separate classifier unnecessary for this task and saving the computation time involved. Compared to the baseline word recognizer, the integrated word-and-boundary recognizer does not involve any computational overhead. © 2002 Elsevier Science B.V. All rights reserved.
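The two ideas in the abstract that lend themselves to a small illustration are (a) treating phrase boundaries as tokens in the language model and (b) scoring words only after those tokens are removed. The following is a minimal sketch, not the authors' implementation: the boundary token `<B>`, the bigram counting, and all function names are assumptions made for illustration, and the HMM boundary models are not covered.

```python
from collections import Counter

# Hypothetical boundary token; the abstract does not name the symbol used.
BOUNDARY = "<B>"


def bigram_counts(sentences):
    """Count bigrams over token sequences that may contain BOUNDARY tokens,
    so boundaries become ordinary events in the language model."""
    counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts


def strip_boundaries(tokens):
    """Remove boundary tokens so that only words remain for WER scoring."""
    return [t for t in tokens if t != BOUNDARY]


def word_errors(ref, hyp):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1]


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction, e.g. 0.04 for a 4% relative improvement."""
    return (baseline_wer - new_wer) / baseline_wer
```

With a recognizer output such as `["yes", "<B>", "on", "monday"]`, `strip_boundaries` gives the word sequence that is compared against the reference with `word_errors`, while the boundary positions remain available to a downstream parser. Note that a relative reduction is computed against the baseline WER, not as an absolute difference in percentage points.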
Pages: 81-95
Number of pages: 15