An annotated corpus for the analysis of VP ellipsis

Cited by: 15
Authors
Bos, Johan [1]
Spenader, Jennifer [1]
Affiliations
[1] Univ Groningen, Groningen, Netherlands
Keywords
Ellipsis; Annotation; Evaluation; VP ellipsis
DOI
10.1007/s10579-011-9142-3
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Verb Phrase Ellipsis (VPE) has been studied in great depth in theoretical linguistics, but empirical studies of VPE are rare. We extend the few previous corpus studies with an annotated corpus of VPE in all 25 sections of the Wall Street Journal corpus (WSJ) distributed with the Penn Treebank. We annotated the raw files using a stand-off annotation scheme that codes the auxiliary verb triggering the elided verb phrase, the start and end of the antecedent, the syntactic type of the antecedent (VP, TV, NP, PP or AP), and the type of syntactic pattern between the source and target clauses of the VPE and its antecedent. We found 487 instances of VPE (including predicative ellipsis, antecedent-contained deletion, comparative constructions, and pseudo-gapping) plus 67 cases of related phenomena such as "do so" anaphora. Inter-annotator agreement was high, with an average F-score of 0.97 across three annotators on one section of the WSJ. Our annotation is theory-neutral and has better coverage than earlier efforts that relied on automatic methods: for example, simply searching the parsed version of the Penn Treebank for empty VPs achieves high precision (0.95) but low recall (0.58) when compared with our manual annotation. The distribution of VPE source-target patterns deviates strongly from the standard examples found in the theoretical linguistics literature on VPE, once more underlining the value of corpus studies. The resulting corpus will be useful for studying VPE phenomena as well as for evaluating natural language processing systems equipped with ellipsis resolution algorithms, and we propose evaluation measures for VPE detection and VPE antecedent selection. The stand-off annotation is freely available for research purposes.
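To put the precision and recall figures quoted in the abstract on a single scale, the sketch below combines them into a balanced F-score (harmonic mean). This is an illustration only: the abstract does not say which F-score variant the authors use, so the balanced F1 form and the function name `f_score` are assumptions, not the paper's definition.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (harmonic mean of precision and recall).

    Assumption: F1 weighting; the abstract does not specify the variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract for the empty-VP search over the
# parsed Penn Treebank, compared with the manual annotation.
baseline = f_score(0.95, 0.58)
print(f"Empty-VP search baseline F-score: {baseline:.2f}")
```

Under this assumption the automatic baseline comes out at roughly 0.72, which shows how much the low recall pulls down an otherwise precise method.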
Pages: 463-494
Page count: 32
Related papers
50 in total
  • [1] An annotated corpus for the analysis of VP ellipsis
    Johan Bos
    Jennifer Spenader
    [J]. Language Resources and Evaluation, 2011, 45 : 463 - 494
  • [2] NoEl: An Annotated Corpus for Noun Ellipsis in English
    Khullar, Payal
    Majmundar, Kushal
    Shrivastava, Manish
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 34 - 43
  • [3] Generation of VP ellipsis: A corpus-based approach
    Hardt, D
    Rambow, O
    [J]. 39TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 2001, : 282 - 289
  • [4] Ellipsis and discourse coherence (VP ellipsis)
    Frazier, Lyn
    Clifton, Charles, Jr.
    [J]. LINGUISTICS AND PHILOSOPHY, 2006, 29 (03) : 315 - 346
  • [5] Focus and VP ellipsis
    Frazier, Lyn
    Clifton, Charles, Jr.
    Carlson, Katy
    [J]. LANGUAGE AND SPEECH, 2007, 50 : 1 - 21
  • [6] Missing objects in Hebrew: Argument ellipsis, not VP ellipsis
    Landau, Idan
    [J]. GLOSSA-A JOURNAL OF GENERAL LINGUISTICS, 2018, 3 (01):
  • [7] Gapping Is Not (VP-) Ellipsis
    Johnson, Kyle
    [J]. LINGUISTIC INQUIRY, 2009, 40 (02) : 289 - 328
  • [8] The Acceptability Cline in VP Ellipsis
    Kim, Christina S.
    Kobele, Gregory M.
    Runner, Jeffrey T.
    Hale, John T.
    [J]. SYNTAX-A JOURNAL OF THEORETICAL EXPERIMENTAL AND INTERDISCIPLINARY RESEARCH, 2011, 14 (04) : 318 - 354
  • [9] An empirical approach to VP ellipsis
    Hardt, D
    [J]. COMPUTATIONAL LINGUISTICS, 1997, 23 (04) : 525 - 541
  • [10] To be or not to be elided: VP ellipsis revisited
    Aelbrecht, Lobke
    Harwood, William
    [J]. LINGUA, 2015, 153 : 66 - 97