Analysis and recognition of whispered speech

Cited by: 121
Authors
Ito, T [1]
Takeda, K [1]
Itakura, F [1]
Affiliations
[1] Nagoya Univ, Grad Sch Engn, Nagoya, Aichi 4648603, Japan
Keywords
speech recognition; whispered speech; telephone handset; noise robustness
DOI
10.1016/j.specom.2003.10.005
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this study, we have examined the acoustic characteristics of whispered speech and addressed some of the issues involved in recognition of whispered speech used for communication over a mobile phone in a noisy environment. The acoustic analysis shows that there is an upward shift of formant frequencies of vowels as observed in the whispered speech data compared to the normal speech data. Voiced consonants in the whispered speech have lower energy at low frequencies up to 1.5 kHz and their spectral flatness is greater compared to the normal speech. In experiments on whispered speech recognition, results of our studies on adaptation of the whispered speech models have shown that adaptation using a small amount of whispered speech data from a target speaker can be effectively used for recognition of the whispered speech. In a noisy environment, the recognition accuracy decreases significantly for the whispered speech compared to the normal speaking of the same speech. A method to increase the SNR by covering the mouth with a hand has been shown to give a higher recognition accuracy for the whispered speech frequently encountered for private communication in a noisy environment. (C) 2004 Elsevier B.V. All rights reserved.
Pages: 139-152
Number of pages: 14
Related papers
50 in total
  • [1] Acoustic analysis and recognition of whispered speech
    Itoh, T
    Takeda, K
    Itakura, F
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 389 - 392
  • [2] Acoustic analysis and recognition of whispered speech
    Itoh, T
    Takeda, K
    Itakura, F
    ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 429 - 432
  • [3] Study on the Emotion Recognition of Whispered Speech
    Jin, Yun
    Zhao, Yan
    Huang, Chengwei
    Zhao, Li
    PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL III, 2009, : 242 - 246
  • [4] Tone Recognition of Chinese Whispered Speech
    Gong Chenghui
    Zhao Heming
    PACIIA: 2008 PACIFIC-ASIA WORKSHOP ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION, VOLS 1-3, PROCEEDINGS, 2008, : 401 - +
  • [5] RECOGNITION OF WORD TONES IN WHISPERED SPEECH
    JENSEN, MK
    WORD-JOURNAL OF THE INTERNATIONAL LINGUISTIC ASSOCIATION, 1958, 14 (2-3): : 187 - 196
  • [6] Maturation of Speech-in-Speech Recognition for Whispered and Voiced Speech
    Buss, Emily
    Miller, Margaret K.
    Leibold, Lori J.
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2022, 65 (08): : 3117 - 3128
  • [7] The Recognition of Whispered Speech in Real-Time
    Hendrickson, Kristi
    Ernest, Danielle
    EAR AND HEARING, 2022, 43 (02): : 554 - 562
  • [8] Performance Analysis of Mandarin Whispered Speech Recognition Based on Normal Speech Training Model
    Chen Xueqin
    Zhao Heming
    Fan Xiaohe
    2016 SIXTH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2016, : 548 - 551
  • [9] Mandarin Connected Digits Recognition for Whispered Speech
    Ru Tingting
    Xie Xiang
    Yin Hui
    Kuang Jingming
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1141 - 1144
  • [10] HTK-Based Recognition of Whispered Speech
    Galic, Jovan
    Jovicic, Slobodan T.
    Grozdic, Dorde
    Markovic, Branko
    SPEECH AND COMPUTER, 2014, 8773 : 251 - 258