'Early recognition' of polysyllabic words in continuous speech

Cited by: 1
Authors
Scharenborg, Odette [1]
ten Bosch, Louis [1]
Boves, Lou [1]
Affiliation
[1] Univ Nijmegen, CLST, NL-6500 HD Nijmegen, Netherlands
Source
COMPUTER SPEECH AND LANGUAGE | 2007 / Vol. 21 / Issue 01
DOI
10.1016/j.csl.2005.12.001
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Humans are able to recognise a word before its acoustic realisation is complete. This is in contrast to conventional automatic speech recognition (ASR) systems, which compute the likelihood of a number of hypothesised word sequences, and identify the words that were recognised on the basis of a trace back of the hypothesis with the highest eventual score, in order to maximise efficiency and performance. In the present paper, we present an ASR system, SpeM, based on principles known from the field of human word recognition that is able to model the human capability of 'early recognition' by computing word activation scores (based on negative log likelihood scores) during the speech recognition process. Experiments on 1463 polysyllabic words in 885 utterances showed that 64.0% (936) of these polysyllabic words were recognised correctly at the end of the utterance. For 81.1% of the 936 correctly recognised polysyllabic words the local word activation allowed us to identify the word before its last phone was available, and 64.1% of those words were already identified one phone after their lexical uniqueness point. We investigated two types of predictors for deciding whether a word is considered as recognised before the end of its acoustic realisation. The first type is related to the absolute and relative values of the word activation, which trade false acceptances for false rejections. The second type of predictor is related to the number of phones of the word that have already been processed and the number of phones that remain until the end of the word. The results showed that SpeM's performance increases if the amount of acoustic evidence in support of a word increases and the risk of future mismatches decreases. © 2006 Elsevier Ltd. All rights reserved.
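The two predictor types described in the abstract can be illustrated with a minimal sketch. This is not SpeM's implementation: the function name `early_recognition_point`, the threshold parameters, and the activation values are all hypothetical. The abstract does not give the formula that maps negative log likelihoods to activations, so the sketch takes per-phone activation values as given and only shows how absolute and relative thresholds, plus a minimum number of processed phones, might be combined into an early decision.

```python
from typing import Optional, Sequence


def early_recognition_point(
    target: Sequence[float],
    competitor: Sequence[float],
    abs_threshold: float,
    rel_threshold: float,
    min_phones: int = 1,
) -> Optional[int]:
    """Return the number of phones processed when the word is first accepted.

    target:      hypothetical activation of the word after each phone
    competitor:  hypothetical activation of its strongest competitor
    Returns None if the word is never accepted.
    """
    for n_phones, (act, other) in enumerate(zip(target, competitor), start=1):
        if n_phones < min_phones:
            continue  # too little acoustic evidence so far
        # Absolute criterion: the activation itself is high enough.
        # Relative criterion: the lead over the best competitor is large enough.
        if act >= abs_threshold and (act - other) >= rel_threshold:
            return n_phones
    return None


# Illustrative (made-up) activations for a five-phone word.
activations = [0.2, 0.5, 0.9, 1.4, 1.6]
best_other = [0.3, 0.4, 0.5, 0.4, 0.3]

point = early_recognition_point(
    activations, best_other, abs_threshold=0.8, rel_threshold=0.3
)
strict = early_recognition_point(
    activations, best_other, abs_threshold=2.0, rel_threshold=0.3
)
late = early_recognition_point(
    activations, best_other, abs_threshold=0.8, rel_threshold=0.3, min_phones=4
)
```

In this toy example the word is accepted after three of its five phones. Raising either threshold delays or prevents acceptance, which corresponds to the trade-off between false acceptances and false rejections mentioned in the abstract.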
Pages: 54-71
Number of pages: 18