Differences between acoustic characteristics of spontaneous and read speech and their effects on speech recognition performance

Cited by: 53

Authors:

Nakamura, Masanobu [1]

Iwano, Koji [1]

Furui, Sadaoki [1]

Affiliation:

[1] Tokyo Inst Technol, Dept Comp Sci, Meguro Ku, Tokyo 1528552, Japan

Source:

COMPUTER SPEECH AND LANGUAGE | 2008 / Vol. 22 / Issue 02

Keywords:

Corpus of Spontaneous Japanese; Spontaneous speech; Spectral reduction; Mahalanobis distance;

DOI:

10.1016/j.csl.2007.07.003

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Although speech derived from read texts, news broadcasts, and other similar prepared contexts can be recognized with high accuracy, recognition performance drastically decreases for spontaneous speech. This is due to the fact that spontaneous speech and read speech are significantly different acoustically as well as linguistically. This paper statistically and quantitatively analyzes differences in acoustic features between spontaneous and read speech using two large-scale speech corpora, "Corpus of Spontaneous Japanese (CSJ)" and "Japanese Newspaper Article Sentences (JNAS)". Experimental results show that spontaneous speech can be characterized by reduced spectral space in comparison with that of read speech, and that the more spontaneous, the more the spectral space shrinks. This paper also clarifies that reduction in the spectral space leads to reduction in phoneme recognition accuracy. This result indicates that spectral reduction is one major reason for the decrease of recognition accuracy in spontaneous speech. © 2007 Published by Elsevier Ltd.
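
Illustrative note: the abstract does not spell out how the reduction in spectral space is computed. The keywords mention Mahalanobis distance, so one plausible way to picture the idea is to compare the average Mahalanobis distance between phoneme means in two corpora. The sketch below is an assumption-laden toy, not the authors' procedure: the function names, the pooled-covariance choice, and the synthetic "read" and "spontaneous" features are all hypothetical and stand in for real CSJ or JNAS data.

```python
import numpy as np


def mean_pairwise_mahalanobis(features_by_phoneme):
    """Average Mahalanobis distance between all pairs of phoneme means.

    features_by_phoneme: dict mapping a phoneme label to an array of
    shape (n_frames, n_dims). The covariance used for each pair is the
    average of the two class covariances (an assumption of this sketch).
    """
    labels = sorted(features_by_phoneme)
    means = {p: features_by_phoneme[p].mean(axis=0) for p in labels}
    covs = {p: np.cov(features_by_phoneme[p], rowvar=False) for p in labels}

    dists = []
    for i, p in enumerate(labels):
        for q in labels[i + 1:]:
            pooled = 0.5 * (covs[p] + covs[q])
            diff = means[p] - means[q]
            # diff^T * pooled^{-1} * diff, without forming the inverse
            d2 = diff @ np.linalg.solve(pooled, diff)
            dists.append(np.sqrt(d2))
    return float(np.mean(dists))


def make_toy_corpus(rng, centres, spread, n_frames=500):
    """Synthetic features: unit-variance noise around scaled phoneme centres."""
    n_dims = centres.shape[1]
    return {
        f"ph{k}": spread * centre + rng.normal(size=(n_frames, n_dims))
        for k, centre in enumerate(centres)
    }


rng = np.random.default_rng(0)
base_centres = rng.normal(size=(5, 4))  # 5 hypothetical phonemes, 4 dims

# Hypothetical corpora: "spontaneous" phoneme centres lie closer together.
read = make_toy_corpus(rng, base_centres, spread=3.0)
spontaneous = make_toy_corpus(rng, base_centres, spread=1.5)

read_d = mean_pairwise_mahalanobis(read)
spont_d = mean_pairwise_mahalanobis(spontaneous)
ratio = spont_d / read_d  # values below 1 indicate a reduced spectral space

print(f"read: {read_d:.2f}  spontaneous: {spont_d:.2f}  ratio: {ratio:.2f}")
```

In this toy setting a ratio below 1 means the phoneme distributions of the "spontaneous" set sit closer together than those of the "read" set, which is the kind of shrinkage the abstract describes.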

Pages: 171-184

Page count: 14

