CoRTE: A Corpus of Recognizing Textual Entailment Data Annotated for Coreference and Bridging Relations

Author
Waseem, Afifah [1]
Affiliation
[1] Univ Oxford, Dept Comp Sci, Oxford OX1 3QD, England
Source
Keywords
Coreference; Bridging relations; Annotated corpus
DOI
10.1007/978-3-030-00794-2_12
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents CoRTE, an English corpus annotated with coreference and bridging relations, whose data is drawn from the main task of recognizing textual entailment (RTE). Our annotation scheme extends existing schemes by introducing subcategories, and each coreference and bridging relation is assigned a category. CoRTE is a useful resource for researchers working on coreference and bridging resolution, as well as on the RTE task. RTE has applications in many NLP domains, so CoRTE makes contextual information readily available to NLP systems being developed for domains that require textual inference and discourse understanding. The paper describes the annotation scheme with examples. We have annotated 340 text-hypothesis pairs, consisting of 24,742 tokens and 8,072 markables.
Pages: 115-125
Page count: 11