AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark

Cited by: 0

Authors:

Friedrich, Niklas [1]

Gashteovski, Kiril [2]

Yu, Mingying [1,2]

Kotnis, Bhushan [2]

Lawrence, Carolin [2]

Niepert, Mathias [2,3]

Glavas, Goran [1,4]

Affiliations:

[1] Univ Mannheim, Mannheim, Germany

[2] NEC Labs Europe, Heidelberg, Germany

[3] Univ Stuttgart, Stuttgart, Germany

[4] Ludwig Maximilians Univ Munchen, Munich, Germany

Source:

PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): PROCEEDINGS OF SYSTEM DEMONSTRATIONS | 2022

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Open Information Extraction (OIE) is the task of extracting facts from sentences in the form of relations and their corresponding arguments in a schema-free manner. The intrinsic performance of OIE systems is difficult to measure due to the incompleteness of existing OIE benchmarks: ground truth extractions do not group all acceptable surface realizations of the same fact that can be extracted from a sentence. To measure the performance of OIE systems more realistically, it is necessary to manually annotate complete facts (i.e., clusters of all acceptable surface realizations of the same fact) from input sentences. We propose AnnIE: an interactive annotation platform that facilitates such challenging annotation tasks and supports the creation of complete fact-oriented OIE evaluation benchmarks. AnnIE is modular and flexible in order to support different use case scenarios (i.e., benchmarks covering different types of facts) and different languages. We use AnnIE to build two complete OIE benchmarks: one with verb-mediated facts and another with facts encompassing named entities. We evaluate several OIE systems on our complete benchmarks created with AnnIE. We publicly release AnnIE under a non-restrictive license.


Pages: 44-60

Page count: 17

