CoRuSS - a New Prosodically Annotated Corpus of Russian Spontaneous Speech

Cited by: 0
Authors
Kachkovskaia, Tatiana [1]
Kocharov, Daniil [1]
Skrelin, Pavel [1]
Volskaya, Nina [1]
Affiliations
[1] St Petersburg State Univ, Dept Phonet, 7-9 Universitetskaya Nab, St Petersburg 199034, Russia
Funding
Russian Science Foundation;
Keywords
speech corpus; speech annotation; Russian;
DOI
Not available
Chinese Library Classification number
H [Language, Linguistics];
Discipline classification code
05;
Abstract
This paper describes the recording, processing and annotation of speech data for a new speech corpus, CoRuSS (Corpus of Russian Spontaneous Speech), which is based on connected communicative speech recorded from 60 native Russian male and female speakers of different age groups (from 16 to 77). Some Russian speech corpora available at the moment contain plain orthographic texts and provide limited annotation, but there are no corpora providing detailed prosodic annotation of spontaneous conversational speech. This corpus contains 30 hours of high-quality recorded spontaneous Russian speech, half of which has been transcribed and prosodically labeled. The recordings consist of dialogues between two speakers, monologues (speakers' self-presentations) and readings of a short phonetically balanced text. Since the corpus is labeled for a wide range of linguistic (phonetic and prosodic) information, it provides a basis for empirical studies of various spontaneous speech phenomena as well as for comparison with those observed in prepared read speech. Since the corpus is designed as an open-access resource of speech data, it will also make it possible to advance corpus-based analysis of spontaneous speech data across languages, as well as speech technology development.
Pages: 1949-1954
Number of pages: 6