AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark

Cited by: 0
Authors
Friedrich, Niklas [1]
Gashteovski, Kiril [2]
Yu, Mingying [1, 2]
Kotnis, Bhushan [2]
Lawrence, Carolin [2]
Niepert, Mathias [2, 3]
Glavas, Goran [1, 4]
Affiliations
[1] Univ Mannheim, Mannheim, Germany
[2] NEC Labs Europe, Heidelberg, Germany
[3] Univ Stuttgart, Stuttgart, Germany
[4] Ludwig Maximilians Univ Munchen, Munich, Germany
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Open Information Extraction (OIE) is the task of extracting facts from sentences in the form of relations and their corresponding arguments in a schema-free manner. The intrinsic performance of OIE systems is difficult to measure because existing OIE benchmarks are incomplete: their ground truth extractions do not group all acceptable surface realizations of the same fact that can be extracted from a sentence. To measure the performance of OIE systems more realistically, it is necessary to manually annotate complete facts (i.e., clusters of all acceptable surface realizations of the same fact) from input sentences. We propose AnnIE: an interactive annotation platform that facilitates such challenging annotation tasks and supports the creation of complete fact-oriented OIE evaluation benchmarks. AnnIE is modular and flexible in order to support different use case scenarios (i.e., benchmarks covering different types of facts) and different languages. We use AnnIE to build two complete OIE benchmarks: one with verb-mediated facts and another with facts encompassing named entities. We evaluate several OIE systems on our complete benchmarks created with AnnIE. We publicly release AnnIE under a non-restrictive license.
Pages: 44-60
Page count: 17