AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark

Authors
Friedrich, Niklas [1]
Gashteovski, Kiril [2]
Yu, Mingying [1, 2]
Kotnis, Bhushan [2]
Lawrence, Carolin [2]
Niepert, Mathias [2, 3]
Glavas, Goran [1, 4]
Affiliations
[1] Univ Mannheim, Mannheim, Germany
[2] NEC Labs Europe, Heidelberg, Germany
[3] Univ Stuttgart, Stuttgart, Germany
[4] Ludwig Maximilians Univ Munchen, Munich, Germany
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Open Information Extraction (OIE) is the task of extracting facts from sentences in the form of relations and their corresponding arguments in a schema-free manner. The intrinsic performance of OIE systems is difficult to measure due to the incompleteness of existing OIE benchmarks: ground truth extractions do not group all acceptable surface realizations of the same fact that can be extracted from a sentence. To measure the performance of OIE systems more realistically, it is necessary to manually annotate complete facts (i.e., clusters of all acceptable surface realizations of the same fact) from input sentences. We propose AnnIE: an interactive annotation platform that facilitates such challenging annotation tasks and supports the creation of complete fact-oriented OIE evaluation benchmarks. AnnIE is modular and flexible in order to support different use case scenarios (i.e., benchmarks covering different types of facts) and different languages. We use AnnIE to build two complete OIE benchmarks: one with verb-mediated facts and another with facts encompassing named entities. We evaluate several OIE systems on our complete benchmarks created with AnnIE. We publicly release AnnIE under a non-restrictive license.
Pages: 44-60
Page count: 17