Preparing a Corpus of Dutch Spontaneous Dialogues for Automatic Phonetic Analysis

Authors
Schuppler, Barbara [1]
Ernestus, Mirjam [1]
Scharenborg, Odette [1]
Boves, Lou [1]
Affiliation
[1] Radboud Univ Nijmegen, Ctr Language & Speech Technol, Nijmegen, Netherlands
Keywords
corpus creation; conversational speech; spontaneous dialogues; reductions; pronunciation variants; automatic phonemic transcription
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents the steps needed to make a corpus of Dutch spontaneous dialogues accessible for automatic phonetic research aimed at increasing our understanding of reduction phenomena and the role of fine phonetic detail. Since the corpus was not created with automatic processing in mind, it needed to be reshaped. The first part of this paper describes the actions needed for this reshaping in some detail. The second part reports the results of a preliminary analysis of the reduction phenomena in the corpus. For this purpose a phonemic transcription of the corpus was created by means of a forced alignment, first with a lexicon of canonical pronunciations and then with multiple pronunciation variants per word. In this study pronunciation variants were generated by applying a large set of phonetic processes that have been implicated in reduction to the canonical pronunciations of the words. This relatively straightforward procedure allows us to produce plausible pronunciation variants and to verify and extend the results of previous reduction studies reported in the literature.
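The variant-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rule set, the phone symbols, the example word, and the function names `expand` and `generate_variants` are all assumptions made for illustration. Each rule rewrites a short phone sequence (for example, deleting a schwa) and is applied optionally at every matching position, so the canonical form always remains among the variants. Choosing among the variants by forced alignment is not shown.

```python
# Minimal sketch: generate pronunciation variants by optionally applying
# reduction rules to a canonical pronunciation. The rules and the example
# transcription below are hypothetical, not taken from the paper.

def expand(pron, pattern, replacement):
    """Return every result of rewriting pattern -> replacement at any
    subset of non-overlapping match sites in pron (all are tuples)."""
    if len(pron) < len(pattern):
        return {pron}  # too short to contain another match
    results = set()
    # Option 1: keep the first phone unchanged and continue with the rest.
    for tail in expand(pron[1:], pattern, replacement):
        results.add(pron[:1] + tail)
    # Option 2: if the pattern matches here, rewrite it and continue.
    if pron[:len(pattern)] == pattern:
        for tail in expand(pron[len(pattern):], pattern, replacement):
            results.add(replacement + tail)
    return results


def generate_variants(canonical, rules):
    """Apply each (pattern, replacement) rule optionally, in order,
    and return the set of all resulting pronunciations."""
    variants = {tuple(canonical)}
    for pattern, replacement in rules:
        if not pattern:
            raise ValueError("rule patterns must be non-empty")
        rewritten = set()
        for pron in variants:
            rewritten |= expand(pron, tuple(pattern), tuple(replacement))
        variants = rewritten
    return variants


if __name__ == "__main__":
    # Hypothetical canonical form in a SAMPA-like notation ("@" = schwa).
    canonical = ["n", "a", "t", "y", "r", "l", "@", "k"]
    # Hypothetical reduction rules: schwa deletion and /r/ deletion.
    rules = [(["@"], []), (["r"], [])]
    for variant in sorted(generate_variants(canonical, rules)):
        print(" ".join(variant))
```

Because every rule is optional at every site, the number of variants grows quickly with the number of rules and match sites. A practical lexicon would likely need context-sensitive rules or pruning, which this sketch omits.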
Pages: 1638-1641
Number of pages: 4