Analysis and recognition of whispered speech

Cited by: 121

Authors:

Ito, T [1]

Takeda, K [1]

Itakura, F [1]

Affiliation:

[1] Nagoya Univ, Grad Sch Engn, Nagoya, Aichi 4648603, Japan

Source:

SPEECH COMMUNICATION | 2005 / Vol. 45 / Issue 02

Keywords:

speech recognition; whispered speech; telephone handset; noise robustness

DOI:

10.1016/j.specom.2003.10.005

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this study, we examine the acoustic characteristics of whispered speech and address some of the issues involved in recognizing whispered speech used for communication over a mobile phone in a noisy environment. The acoustic analysis shows that vowel formant frequencies shift upward in the whispered speech data compared with the normal speech data. Voiced consonants in whispered speech have lower energy at low frequencies up to 1.5 kHz, and their spectral flatness is greater than in normal speech. In the recognition experiments, adapting the whispered speech models with a small amount of whispered speech data from a target speaker proved effective for recognizing whispered speech. In a noisy environment, recognition accuracy decreases significantly for whispered speech compared with the same speech spoken normally. Covering the mouth with a hand to increase the SNR is shown to give higher recognition accuracy for whispered speech, which is frequently used for private communication in a noisy environment. (C) 2004 Elsevier B.V. All rights reserved.


Pages: 139-152

Number of pages: 14

