Constructing multi-level speech database for spontaneous speech processing

Authors:

Hahn, M

Kim, S

Lee, JC

Lee, YJ

Source:

ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4 | 1996

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper describes a new database, called the multi-level speech database, for spontaneous speech processing. We designed the database to cover textual and acoustic variations ranging from declarative speech to spontaneous speech. The database is composed of five categories which are, in order of decreasing spontaneity: spontaneous speech, interview, simulated interview, declarative speech with context, and declarative speech without context. We collected a total of 112 sets from 23 subjects (19 male, 4 female). The database was first transcribed using 15 transcription symbols according to our own transcription rules; prosodic information will be added in a second stage. The goals of this research are a comparative textual and prosodic analysis at each level and the quantification of the spontaneity of diversified speech databases for dialogue speech synthesis and recognition. A preliminary analysis of the transcribed texts shows that, as expected, spontaneous speech contains more corrections, repetitions, and pauses than the other categories. In addition, the average number of sentences per turn is greater for spontaneous speech than for the other categories. These results suggest that the spontaneity of a speech database can be quantified.

Pages: 1930-1933

Number of pages: 4
