Differences between acoustic characteristics of spontaneous and read speech and their effects on speech recognition performance

Cited by: 53
Authors
Nakamura, Masanobu [1]
Iwano, Koji [1]
Furui, Sadaoki [1]
Affiliation
[1] Tokyo Inst Technol, Dept Comp Sci, Meguro Ku, Tokyo 1528552, Japan
Source
COMPUTER SPEECH AND LANGUAGE | 2008 / Vol. 22 / Issue 02
Keywords
Corpus of Spontaneous Japanese; Spontaneous speech; Spectral reduction; Mahalanobis distance;
DOI
10.1016/j.csl.2007.07.003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although speech derived from read texts, news broadcasts, and other similar prepared contexts can be recognized with high accuracy, recognition performance drastically decreases for spontaneous speech. This is due to the fact that spontaneous speech and read speech are significantly different acoustically as well as linguistically. This paper statistically and quantitatively analyzes differences in acoustic features between spontaneous and read speech using two large-scale speech corpora, "Corpus of Spontaneous Japanese (CSJ)" and "Japanese Newspaper Article Sentences (JNAS)". Experimental results show that spontaneous speech can be characterized by reduced spectral space in comparison with that of read speech, and that the more spontaneous, the more the spectral space shrinks. This paper also clarifies that reduction in the spectral space leads to reduction in phoneme recognition accuracy. This result indicates that spectral reduction is one major reason for the decrease of recognition accuracy in spontaneous speech. (C) 2007 Published by Elsevier Ltd.
Pages: 171-184
Number of pages: 14
Related papers
50 in total
  • [1] DIFFERENCES BETWEEN READ AND SPONTANEOUS SPEECH OF DEAF-CHILDREN
    SMITH, CR
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1982, 72 (04): 1304 - 1305
  • [2] Prosody for Mandarin Speech Recognition: a Comparative Study of Read and Spontaneous Speech
    Yeung, Yu Ting
    Qian, Yao
    Lee, Tan
    Soong, Frank K.
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1133 - +
  • [3] DIALECT IDENTIFICATION: IMPACT OF DIFFERENCES BETWEEN READ VERSUS SPONTANEOUS SPEECH
    Liu, Gang
    Lei, Yun
    Hansen, John H. L.
    18TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2010), 2010, : 2003 - 2006
  • [4] DETECTING DEPRESSION: A COMPARISON BETWEEN SPONTANEOUS AND READ SPEECH
    Alghowinem, Sharifa
    Goecke, Roland
    Wagner, Michael
    Epps, Julien
    Breakspear, Michael
    Parker, Gordon
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7547 - 7551
  • [5] Using Syllables as Acoustic Units for Spontaneous Speech Recognition
    Hejtmanek, Jan
    TEXT, SPEECH AND DIALOGUE, 2010, 6231 : 299 - 305
  • [6] Improving Acoustic Models for Russian Spontaneous Speech Recognition
    Prudnikov, Alexey
    Medennikov, Ivan
    Mendelev, Valentin
    Korenevsky, Maxim
    Khokhlov, Yuri
    SPEECH AND COMPUTER (SPECOM 2015), 2015, 9319 : 234 - 242
  • [7] Important prosody characteristics for spontaneous speech recognition
    Kleckova, J
    Krutisova, J
    Matousek, V
    Schwarz, J
    ICONIP'02: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING: COMPUTATIONAL INTELLIGENCE FOR THE E-AGE, 2002, : 717 - 721
  • [8] Reconsidering Read and Spontaneous Speech: Causal Perspectives on the Generation of Training Data for Automatic Speech Recognition
    Gabler, Philipp
    Geiger, Bernhard C.
    Schuppler, Barbara
    Kern, Roman
    INFORMATION, 2023, 14 (02)
  • [9] Difference of acoustic modeling for read speech and dialogue speech
    Mimura, M.
    Kawahara, T.
    Acoustical Science and Technology, 2001, 22 (05) : 373 - 374
  • [10] THE RECOGNITION OF WORDS AFTER THEIR ACOUSTIC OFFSETS IN SPONTANEOUS SPEECH - EFFECTS OF SUBSEQUENT CONTEXT
    BARD, EG
    SHILLCOCK, RC
    ALTMANN, GTM
    PERCEPTION & PSYCHOPHYSICS, 1988, 44 (05): 395 - 408