Integrated recognition of words and prosodic phrase boundaries

Citations: 20
Authors
Gallwitz, F [1]
Niemann, H [1]
Nöth, E [1]
Warnke, V [1]
Affiliations
[1] Univ Erlangen Nurnberg, Chair Pattern Recognit, D-91058 Erlangen, Germany
Keywords
speech recognition; prosody; speech understanding;
DOI
10.1016/S0167-6393(01)00027-9
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we present an integrated approach for recognizing both the word sequence and the syntactic-prosodic structure of a spontaneous utterance. The approach aims at improving the performance of the understanding component of speech understanding systems by exploiting not only acoustic-phonetic and syntactic information, but also prosodic information directly within the speech recognition process. Whereas spoken utterances are typically modelled as unstructured word sequences in the speech recognizer, our approach includes phrase boundary information in the language model and provides HMMs to model the acoustic and prosodic characteristics of phrase boundaries. This methodology has two major advantages compared to purely word-based speech recognizers. First, additional syntactic-prosodic boundaries are determined by the speech recognizer, which facilitates parsing and helps resolve syntactic and semantic ambiguities. Second - after having removed the boundary information from the result of the recognizer - the integrated model yields a 4% relative word error rate (WER) reduction compared to a traditional word recognizer. The boundary classification performance is equal to that of a separate prosodic classifier operating on the word recognizer output, thus making a separate classifier unnecessary for this task and saving the computation time involved. Compared to the baseline word recognizer, the integrated word-and-boundary recognizer does not involve any computational overhead. (C) 2002 Elsevier Science B.V. All rights reserved.
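The evaluation step mentioned in the abstract (removing boundary information from the recognizer output before computing WER) can be illustrated with a minimal sketch. This is not the authors' implementation: the boundary symbol `<B>`, the function names, and the example utterances are all assumptions made for illustration, and WER is computed here with a plain word-level Levenshtein distance.

```python
from typing import List

# Hypothetical boundary symbol; the record does not say what token the paper uses.
BOUNDARY = "<B>"


def strip_boundaries(tokens: List[str], boundary: str = BOUNDARY) -> List[str]:
    """Drop phrase-boundary tokens so only the word sequence remains."""
    return [t for t in tokens if t != boundary]


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: List[str], hyp: List[str]) -> float:
    """Word error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - improved) / baseline


# Made-up example: recognizer output with boundary tokens, scored on words only.
reference = "yes okay on monday then".split()
hypothesis = "yes <B> okay on sunday <B> then".split()
word_only = strip_boundaries(hypothesis)
example_wer = wer(reference, word_only)
```

With these definitions, a "4% relative" reduction means `relative_reduction(baseline, improved)` is about 0.04; for instance, hypothetical WERs of 0.25 and 0.24 would give that value. The absolute WER figures of the paper are not stated in this record.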
Pages: 81 - 95
Number of pages: 15
Related papers
50 in total
  • [1] SEGMENTAL DURATIONS IN THE VICINITY OF PROSODIC PHRASE BOUNDARIES
    WIGHTMAN, CW
    SHATTUCKHUFNAGEL, S
    OSTENDORF, M
    PRICE, PJ
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1992, 91 (03): 1707 - 1717
  • [3] Pause or No Pause?—Prosodic Phrase Boundaries Revisited
    郑秋豫
    张俊祥
    Tsinghua Science and Technology, 2008, (04) : 500 - 509
  • [4] Study on prediction of prosodic phrase boundaries in Chinese TTS
    Zhao, Ziping
    Zhu, Yaoting
    SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 3, PROCEEDINGS, 2007, : 354 - +
  • [5] Prosodic rules for the implementation of phrase boundaries in synthetic speech
    Sanderman, AA
    Collier, R
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (05): 3390 - 3397
  • [6] On the weight of phrase-final prosodic words in a sign language
    Crasborn, Onno
    van der Kooij, Els
    Ros, Johan
    SIGN LANGUAGE & LINGUISTICS, 2012, 15 (01) : 11 - 38
  • [7] Prediction of prosodic phrase boundaries considering variable speaking rate
    Kim, YJ
    Oh, YH
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996: 1505 - 1508
  • [9] PHRASE RECOGNITION IN CONVERSATIONAL SPEECH USING PROSODIC AND PHONEMIC INFORMATION
    OKAWA, S
    ENDO, T
    KOBAYASHI, T
    SHIRAI, K
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1993, E76D (01) : 44 - 50
  • [10] Investigating Effect of Rich Syntactic Features on Mandarin Prosodic Phrase Boundaries Prediction
    Che, Hao
    Wen, Zhengqi
    Li, Ya
    Tao, Jianhua
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014: 501 - 505