Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie

Authors
Bawden, Rachel [1, 2]
Botalla, Marie-Amelie [1, 2]
Gerdes, Kim [2, 3]
Kahane, Sylvain [1, 2]
Affiliations
[1] Univ Paris Ouest Nanterre, Modyco, Paris, France
[2] CNRS, F-75700 Paris, France
[3] Univ Paris 03, LPP, F-75230 Paris 05, France
Keywords
dependency; treebank; spoken French
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
This article presents the methods, results, and precision of the syntactic annotation process of the Rhapsodie Treebank of spoken French. The Rhapsodie Treebank is a 33,000-word corpus annotated for prosody and syntax and licensed in its entirety under Creative Commons. The syntactic annotation contains two levels: a macro-syntactic level, containing a segmentation into illocutionary units (including discourse markers and parentheses), and a micro-syntactic level, including dependency relations and various paradigmatic structures called pile constructions, the latter being particularly frequent and diverse in spoken language. The micro-syntactic annotation process, presented in this paper, includes a semi-automatic preparation of the transcription, the application of a syntactic dependency parser, the transcoding of the parsing results to the Rhapsodie annotation scheme, manual correction by multiple annotators followed by a validation process, and finally the application of coherence rules that check for common errors. The inter-annotator agreement scores, which are good, are presented and analyzed in detail. The article also lists the functions used in the dependency annotation and in the distinction of various pile constructions, and presents the ideas underlying these choices.
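The abstract does not state which inter-annotator agreement measure was used. As a rough illustration only, the sketch below assumes two common choices for dependency annotation: the share of tokens on which two annotators agree (on the governor alone, and on governor plus function), and Cohen's kappa over the function labels. The function names, token indices, and labels (`sub`, `obj`, `dep`, `root`) are invented for the example and are not taken from the Rhapsodie annotation scheme.

```python
from collections import Counter

def attachment_agreement(ann_a, ann_b):
    """Return (unlabeled, labeled) agreement between two annotations.

    Each annotation maps a token index to a (governor index, function label)
    pair. Both annotations must cover exactly the same tokens.
    """
    if ann_a.keys() != ann_b.keys():
        raise ValueError("annotations must cover the same tokens")
    n = len(ann_a)
    same_head = sum(1 for t in ann_a if ann_a[t][0] == ann_b[t][0])
    same_both = sum(1 for t in ann_a if ann_a[t] == ann_b[t])
    return same_head / n, same_both / n

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each annotator's label proportions.
    expected = sum(count_a[lab] * count_b[lab] for lab in count_a) / (n * n)
    if expected == 1.0:  # both annotators used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical annotations of a four-token utterance (illustration only).
ann_a = {1: (2, "sub"), 2: (0, "root"), 3: (2, "obj"), 4: (3, "dep")}
ann_b = {1: (2, "sub"), 2: (0, "root"), 3: (2, "dep"), 4: (2, "dep")}

unlabeled, labeled = attachment_agreement(ann_a, ann_b)
kappa = cohen_kappa(
    [ann_a[t][1] for t in sorted(ann_a)],
    [ann_b[t][1] for t in sorted(ann_b)],
)
print(unlabeled, labeled, round(kappa, 3))
```

Raw agreement is easy to read, but it does not correct for labels that annotators would match on by chance. Kappa applies that correction, which is why the two are often reported side by side.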
Pages: 2320-2325
Page count: 6