The Recognition of Whispered Speech in Real-Time

Cited by: 5
Authors
Hendrickson, Kristi [1, 2]
Ernest, Danielle [1]
Affiliations
[1] Univ Iowa, Dept Commun Sci & Disorders, 250 Hawkins Dr, Iowa City, IA 52240 USA
[2] Univ Iowa, Dept Psychol & Brain Sci, 250 Hawkins Dr, Iowa City, IA 52240 USA
Source
EAR AND HEARING | 2022, Vol. 43, No. 2
Keywords
Competition; Eye tracking; Lexical; Speech perception; Whispered speech; Word recognition; SPOKEN-WORD RECOGNITION; PERCEIVED PITCH; PERCEPTION; INFORMATION; LANGUAGE; FEATURES; VOWELS; NOISE; MODEL;
DOI
10.1097/AUD.0000000000001114
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104; 100213;
Abstract
Objectives: Whispered speech offers a unique set of challenges to speech perception and word recognition. The goals of the present study were twofold: first, to determine how listeners recognize whispered speech; second, to inform major theories of spoken word recognition by considering how recognition changes when major cues to phoneme identity are reduced or largely absent compared with normal voiced speech.
Design: Using eye tracking in the Visual World Paradigm, we examined how listeners recognize whispered speech. After hearing a target word (normal or whispered), participants selected the corresponding image from a display of four: a target (e.g., money), a word that shares sounds with the target at the beginning (cohort competitor, e.g., mother), a word that shares sounds with the target at the end (rhyme competitor, e.g., honey), and a phonologically unrelated word (e.g., whistle). Eye movements to each object were monitored to measure (1) how fast listeners process whispered speech, and (2) how strongly they consider lexical competitors (cohorts and rhymes) as the speech signal unfolds.
Results: Listeners were slower to recognize whispered words. Compared with normal speech, listeners displayed slower reaction times to click the target image, were slower to fixate the target, and fixated the target less overall. Further, we found clear evidence that the dynamics of lexical competition are altered during whispered speech recognition. Relative to normal speech, words that overlapped with the target at the beginning (cohorts) displayed slower, reduced, and delayed activation, whereas words that overlapped with the target at the end (rhymes) exhibited faster, more robust, and longer lasting activation.
Conclusion: When listeners are confronted with whispered speech, they engage in a "wait-and-see" approach. Listeners delay lexical access, and by the time they begin to consider what word they are hearing, the beginning of the word has largely come and gone, and activation for cohorts is reduced. However, delays in lexical access actually increase consideration of rhyme competitors; the delay pushes lexical activation to a point later in processing, and the recognition system puts more weight on the word-final overlap between the target and the rhyme.
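The abstract describes monitoring eye movements to the target, cohort, rhyme, and unrelated images as the word unfolds. A common way to summarize such Visual World data is the proportion of trials fixating each object in each time bin. The sketch below illustrates that idea only; it is not the authors' analysis pipeline, and the function name `fixation_proportions`, the bin layout, and the toy trials are all assumptions made up for illustration.

```python
from collections import Counter

# Object roles in a four-image Visual World display (names are illustrative).
ROLES = ("target", "cohort", "rhyme", "unrelated")


def fixation_proportions(trials, n_bins):
    """Return, for each time bin, the share of trials fixating each role.

    trials: list of trials; each trial is a list of length n_bins whose
    entries are a role name from ROLES, or None when no object is fixated.
    """
    total = len(trials)
    proportions = []
    for b in range(n_bins):
        # Count which role (or None) each trial is fixating in this bin.
        counts = Counter(trial[b] for trial in trials)
        proportions.append({role: counts[role] / total for role in ROLES})
    return proportions


# Made-up toy data: 4 trials, 5 time bins each (NOT data from the study).
toy_trials = [
    [None, "cohort", "cohort", "target", "target"],
    [None, None, "target", "target", "target"],
    [None, "rhyme", "rhyme", "rhyme", "target"],
    [None, "unrelated", "cohort", "target", "target"],
]

curves = fixation_proportions(toy_trials, n_bins=5)
for b, row in enumerate(curves):
    print(b, row)
```

Comparing such curves between the normal and whispered conditions is one way to see whether cohort looks are reduced or delayed and whether rhyme looks last longer, which is the pattern the abstract reports.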
Pages: 554-562
Number of pages: 9