Annotating Entities with Fine-Grained Types in Austrian Court Decisions

Cited by: 1
Authors
Revenko, Artem [1]
Breit, Anna [1]
Mireles, Victor [1]
Moreno-Schneider, Julian [2]
Sageder, Christian [3]
Karampatakis, Sotirios [1]
Affiliations
[1] Semant Web Co GmbH, Vienna, Austria
[2] DFKI GmbH, Kaiserslautern, Germany
[3] Cybly GmbH, Salzburg, Austria
Source
Keywords
Named Entity Recognition; Entity Typing; Legal Corpus; Natural Language Processing
DOI
10.3233/SSW210041
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
The use of Named Entity Recognition tools on domain-specific corpora is often hampered by insufficient training data. We investigate an approach to produce fine-grained named entity annotations of a large corpus of Austrian court decisions from a small manually annotated training data set. We first apply a general-purpose Named Entity Recognition model to produce annotations of common coarse-grained types. Next, a small sample of these annotations is manually inspected by domain experts to produce an initial fine-grained training data set. To use the small manually annotated data set efficiently, we formulate named entity typing as a binary classification task: for each originally annotated occurrence of an entity and for each fine-grained type, we verify whether the entity belongs to that type. For this purpose we train a transformer-based classifier. We randomly sample 547 predictions and evaluate them manually. The incorrect predictions are then used to improve the classifier: the corrected annotations are added to the training set. The experiments show that re-training with even a very small number (5 or 10) of originally incorrect predictions can significantly improve classifier performance. Finally, we train the classifier on all available data and re-annotate the whole data set.
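The binary formulation described in the abstract can be illustrated with a minimal sketch. It only shows how one annotated mention might be expanded into one yes/no example per fine-grained type. The type names, the `Mention` and `BinaryExample` classes, and the `to_binary_examples` helper are all hypothetical; the abstract does not specify the type inventory, the input encoding, or the classifier itself.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical fine-grained types; the real inventory is not given in the abstract.
FINE_TYPES = ["court", "law_firm", "insurance_company"]


@dataclass(frozen=True)
class Mention:
    text: str         # surface form found by the coarse-grained NER model
    context: str      # sentence in which the mention occurs
    coarse_type: str  # coarse label, e.g. "ORG"


@dataclass(frozen=True)
class BinaryExample:
    mention: Mention
    candidate_type: str  # fine-grained type being tested
    label: int           # 1 if the mention belongs to candidate_type, else 0


def to_binary_examples(mention: Mention, gold_types: Set[str],
                       fine_types: List[str] = FINE_TYPES) -> List[BinaryExample]:
    """Expand one annotated mention into one binary example per fine-grained type."""
    return [
        BinaryExample(mention, t, int(t in gold_types))
        for t in fine_types
    ]


# Assumed example mention, annotated by an expert as a court.
mention = Mention(
    text="Oberster Gerichtshof",
    context="Der Oberste Gerichtshof hat entschieden.",
    coarse_type="ORG",
)
examples = to_binary_examples(mention, gold_types={"court"})
for ex in examples:
    print(ex.candidate_type, ex.label)
```

Under this framing, each expert-annotated mention yields as many training examples as there are fine-grained types, which is one plausible reason a small annotated set goes further. A manually corrected prediction would enter the training set in the same form.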
Pages: 139-153
Page count: 15