Wide context acoustic modeling in read vs. spontaneous speech

Cited by: 0

Authors:

Finke, M

Rogina, I

Affiliation:

Source:

1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS | 1997

Keywords:

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Context-dependent acoustic models have been applied in speech recognition research for many years, and have been shown to increase recognition accuracy significantly. The most common approach is to use triphones. Recently, several speech recognition groups have started investigating the use of larger phonetic context windows when building acoustic models. In this paper we discuss some of the computational problems arising from wide context modeling (polyphonic modeling) and present methods to cope with these problems. A two-stage, decision-tree-based polyphonic clustering approach is described which implements a more flexible parameter tying scheme. The new clustering approach gave us significant improvement across all tasks (WSJ, SWB, and Spontaneous Scheduling Task) and across all languages involved (German, Spanish, English). We report recognition results based on the JANUS speech recognition toolkit [2, 8] on two tasks comparing acoustic context phenomena in English read versus spontaneous speech. We used our WSJ 60K recognizer and the JANUS SWB 10K polyphonic recognizer.


Pages: 1743-1746

Number of pages: 4
