Overview of HIPE-2022: Named Entity Recognition and Linking in Multilingual Historical Documents

Cited by: 5
Authors
Ehrmann, Maud [1]
Romanello, Matteo [2]
Najem-Meyer, Sven [1]
Doucet, Antoine [3]
Clematide, Simon [4]
Affiliations
[1] EPFL, Digital Humanities Lab, Vaud, Switzerland
[2] Univ Lausanne, Lausanne, Switzerland
[3] Univ La Rochelle, La Rochelle, France
[4] Univ Zurich, Dept Computat Linguist, Zurich, Switzerland
Keywords
Named entity recognition and classification; Entity linking; Historical texts; Information extraction; Digitised newspapers; Digital humanities;
DOI
10.1007/978-3-031-13643-6_26
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an overview of the second edition of HIPE (Identifying Historical People, Places and other Entities), a shared task on named entity recognition and linking in multilingual historical documents. Following the success of the first CLEF-HIPE-2020 evaluation lab, HIPE-2022 confronts systems with the challenges of dealing with more languages, learning domain-specific entities, and adapting to diverse annotation tag sets. This shared task is part of the ongoing efforts of the natural language processing and digital humanities communities to adapt and develop appropriate technologies to efficiently retrieve and explore information from historical texts. On such material, however, named entity processing techniques face the challenges of domain heterogeneity, input noisiness, dynamics of language, and lack of resources. In this context, the main objective of HIPE-2022, run as an evaluation lab of the CLEF 2022 conference, is to gain new insights into the transferability of named entity processing approaches across languages, time periods, document types, and annotation tag sets. Tasks, corpora, and results of participating teams are presented.
Pages: 423 - 446
Page count: 24
Related papers
50 items in total
  • [1] Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
    Ehrmann, Maud
    Romanello, Matteo
    Doucet, Antoine
    Clematide, Simon
    ADVANCES IN INFORMATION RETRIEVAL, PT II, 2022, 13186 : 347 - 354
  • [2] A Multilingual Dataset for Named Entity Recognition, Entity Linking and Stance Detection in Historical Newspapers
    Hamdi, Ahmed
    Pontes, Elvys Linhares
    Boros, Emanuela
    Thi Tuyet Hai Nguyen
    Hackl, Guenter
    Moreno, Jose G.
    Doucet, Antoine
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021: 2328 - 2334
  • [3] Named Entity Recognition and Classification in Historical Documents: A Survey
    Ehrmann, Maud
    Hamdi, Ahmed
    Pontes, Elvys Linhares
    Romanello, Matteo
    Doucet, Antoine
    ACM COMPUTING SURVEYS, 2024, 56 (02)
  • [4] Multilingual Transformers for Named Entity Recognition
    Viksna, Rinalds
    Skadina, Inguna
    BALTIC JOURNAL OF MODERN COMPUTING, 2022, 10 (03): 457 - 469
  • [5] An Overview of Named Entity Recognition
    Sun, Peng
    Yang, Xuezhen
    Zhao, Xiaobing
    Wang, Zhijuan
    2018 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2018: 273 - 278
  • [6] Named entity recognition in Vietnamese documents
    Tri Tran, Q.
    Thao Pham, T.X.
    Hung Ngo, Q.
    Dinh, Dien
    Collier, Nigel
    Progress in Informatics, 2007, (04): 5 - 13
  • [7] Language Clustering for Multilingual Named Entity Recognition
    Shaffer, Kyle
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 40 - 45
  • [8] Joint Learning of Named Entity Recognition and Entity Linking
    Martins, Pedro Henrique
    Marinho, Zita
    Martins, Andre F. T.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019: 190 - 196
  • [9] Named Entity Recognition for Tamil Biomedical Documents
    Antony, Betina J.
    Mahalakshmi, G. S.
    2014 IEEE INTERNATIONAL CONFERENCE ON CIRCUIT, POWER AND COMPUTING TECHNOLOGIES (ICCPCT-2014), 2014: 1571 - 1577
  • [10] Arabic named entity recognition in crime documents
    Asharef, M.
    Omar, N.
    Albared, M.
    Journal of Theoretical and Applied Information Technology, 2012, 44 (01) : 1 - 6