Integrated recognition of words and prosodic phrase boundaries

Cited by: 20
Authors
Gallwitz, F [1]
Niemann, H [1]
Nöth, E [1]
Warnke, V [1]
Affiliations
[1] Univ Erlangen Nurnberg, Chair Pattern Recognit, D-91058 Erlangen, Germany
Keywords
speech recognition; prosody; speech understanding
DOI
10.1016/S0167-6393(01)00027-9
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we present an integrated approach for recognizing both the word sequence and the syntactic-prosodic structure of a spontaneous utterance. The approach aims at improving the performance of the understanding component of speech understanding systems by exploiting not only acoustic-phonetic and syntactic information, but also prosodic information directly within the speech recognition process. Whereas spoken utterances are typically modelled as unstructured word sequences in the speech recognizer, our approach includes phrase boundary information in the language model and provides HMMs to model the acoustic and prosodic characteristics of phrase boundaries. This methodology has two major advantages compared to purely word-based speech recognizers. First, the speech recognizer determines additional syntactic-prosodic boundaries, which facilitates parsing and helps resolve syntactic and semantic ambiguities. Second, after the boundary information has been removed from the recognizer output, the integrated model yields a 4% relative word error rate (WER) reduction compared to a traditional word recognizer. The boundary classification performance is equal to that of a separate prosodic classifier operating on the word recognizer output, thus making a separate classifier unnecessary for this task and saving the computation time involved. Compared to the baseline word recognizer, the integrated word-and-boundary recognizer does not involve any computational overhead. (C) 2002 Elsevier Science B.V. All rights reserved.
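The evaluation step mentioned in the abstract (removing boundary information from the recognizer output before scoring words) can be illustrated with a minimal sketch. It is not the authors' scoring tool: the boundary symbol `<B>`, the function names, and the example utterances are all assumptions made for illustration, and the WER here is plain word-level Levenshtein distance divided by reference length.

```python
BOUNDARY = "<B>"  # hypothetical boundary symbol; the paper's actual label set is not given here


def strip_boundaries(tokens):
    """Drop boundary symbols so only the word sequence is scored."""
    return [t for t in tokens if t != BOUNDARY]


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction, e.g. 0.04 for a 4% relative improvement."""
    return (baseline_wer - new_wer) / baseline_wer


# Made-up example: an integrated hypothesis containing boundary symbols.
reference = "yes monday is fine".split()
integrated_output = ["yes", BOUNDARY, "monday", "is", "fine", BOUNDARY]
wer = word_error_rate(reference, strip_boundaries(integrated_output))
```

Note that a relative reduction is measured against the baseline WER, so under this definition a drop from an illustrative 25% to 24% absolute WER would count as a 4% relative reduction.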
Pages: 81-95
Number of pages: 15
Related papers
50 in total
  • [31] The prosodic structure of early words
    Demuth, K
    SIGNAL TO SYNTAX: BOOTSTRAPPING FROM SPEECH TO GRAMMAR IN EARLY ACQUISITION, 1996, : 171 - 184
  • [32] The prosodic structure of function words
    Selkirk, E
    SIGNAL TO SYNTAX: BOOTSTRAPPING FROM SPEECH TO GRAMMAR IN EARLY ACQUISITION, 1996, : 187 - 213
  • [33] Nonlocal effects of prosodic boundaries
    Katy Carlson
    Charles Clifton
    Lyn Frazier
    Memory & Cognition, 2009, 37 : 1014 - 1025
  • [34] Types of phrase and order of the words
    [Author unknown]
    LINGUA E STILE, 2015, 50 (01) : 161 - 161
  • [35] Prosodic cues to syntactic boundaries
    Inst of Psychology, The Chinese Acad of Sciences, Beijing, China
    Shengxue Xuebao, 5 (414-421):
  • [36] Prosodic structure between the prosodic word and the phonological phrase: Recursive nodes or an independent domain?
    Vigario, Marina
    LINGUISTIC REVIEW, 2010, 27 (04): : 485 - 530
  • [37] Prosodic boundaries in adjunct attachment
    Carlson, K
    Clifton, C
    Frazier, L
    JOURNAL OF MEMORY AND LANGUAGE, 2001, 45 (01) : 58 - 81
  • [38] Prosodic boundaries in alaryngeal speech
    van Rossum, M. A.
    Quene, H.
    Nooteboom, S. G.
    CLINICAL LINGUISTICS & PHONETICS, 2008, 22 (03) : 215 - 231
  • [39] Rule learning based Chinese prosodic phrase prediction
    Tao, JH
    Dong, HH
    Zhao, S
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 425 - 432
  • [40] Mongolian Prosodic Phrase Prediction using Suffix Segmentation
    Liu, Rui
    Bao, Feilong
    Gao, Guanglai
    Wang, Weihua
    PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2016, : 250 - 253