Investigating Entity Linking in Early English Legal Documents

Cited by: 6
Authors
Munnelly, Gary [1]
Lawless, Seamus [1]
Affiliations
[1] Adapt Ctr, Dublin, Ireland
Funding
Science Foundation Ireland
Keywords
Named Entity Disambiguation; Digital Humanities; Cultural Heritage
DOI
10.1145/3197026.3197055
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we investigate the accuracy and overall suitability of a variety of Entity Linking systems for the task of disambiguating entities in 17th century depositions obtained during the 1641 Irish Rebellion. The depositions are extremely difficult for modern NLP tools to work with due to inconsistent spelling, use of language and archaic references. In order to assess the severity of difficulty faced by Entity Linking systems when working with these documents we use the depositions to create an evaluation corpus. This corpus is used as an input to the General Entity Annotator Benchmarking Framework, a standard benchmarking platform for entity annotation systems. Based on this corpus and the results obtained from the General Entity Annotator Benchmarking Framework we observe that the accuracy of existing Entity Linking systems is limited when applied to content like these depositions. This is due to a number of issues ranging from problems with existing state-of-the-art systems to poor representation of historic entities in modern knowledge bases. We discuss some interesting questions raised by this evaluation and put forward a plan for future work in order to learn more.
Pages: 59-67
Number of pages: 9