The Utility of Context When Extracting Entities From Legal Documents

Cited by: 3
Authors
Donnelly, Jonathan [1]
Roegiest, Adam [1]
Affiliations
[1] Kira Syst, Toronto, ON, Canada
DOI
10.1145/3340531.3412746
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
When reviewing documents for legal tasks such as Mergers and Acquisitions, granular information (such as start dates and exit clauses) needs to be identified and extracted. Inspired by previous work in Named Entity Recognition (NER), we investigate how NER techniques can be leveraged to aid lawyers in this review process. Due to the extremely low prevalence of target information in legal documents, we find that the traditional approach of tagging all sentences in a document is inferior, in both effectiveness and the data required to train and predict, to using a first-pass layer to identify sentences that are likely to contain the relevant information and then running the more traditional sentence-level sequence tagging. Moreover, we find that such entity-level models can be improved by training on a balanced sample of relevant and non-relevant sentences. We additionally describe the use of our system in production and how its usage by clients means that deep learning architectures tend to be cost-inefficient, especially with respect to the time needed to train models.
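The two ideas in the abstract, a first-pass sentence filter followed by sentence-level tagging, and training on a balanced sample of relevant and non-relevant sentences, can be illustrated with a minimal Python sketch. This is not the authors' system: `balanced_sample`, `two_stage_extract`, `toy_is_relevant`, `toy_tag_sentence`, the `START_DATE` label, and the example sentences are all hypothetical names and data introduced here, and the keyword filter and regular expression merely stand in for trained models.

```python
import random
import re
from typing import Callable, List, Tuple

Entity = Tuple[str, str]  # (label, extracted text)


def balanced_sample(sentences: List[str], relevant: List[bool],
                    seed: int = 0) -> List[Tuple[str, bool]]:
    """Keep every relevant sentence and downsample the non-relevant ones
    so both classes have (at most) the same size."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(sentences, relevant) if y]
    neg = [s for s, y in zip(sentences, relevant) if not y]
    neg = rng.sample(neg, min(len(neg), len(pos)))
    data = [(s, True) for s in pos] + [(s, False) for s in neg]
    rng.shuffle(data)
    return data


def two_stage_extract(sentences: List[str],
                      is_relevant: Callable[[str], bool],
                      tag_sentence: Callable[[str], List[Entity]]) -> List[Entity]:
    """Run the tagger only on sentences the first-pass filter accepts."""
    entities: List[Entity] = []
    for sentence in sentences:
        if is_relevant(sentence):  # stage 1: sentence-level filter
            entities.extend(tag_sentence(sentence))  # stage 2: tagging
    return entities


# Hypothetical stand-ins for trained models (assumptions, not the paper's).
def toy_is_relevant(sentence: str) -> bool:
    return "commence" in sentence.lower()


DATE_PATTERN = re.compile(r"[A-Z][a-z]+ \d{1,2}, \d{4}")


def toy_tag_sentence(sentence: str) -> List[Entity]:
    return [("START_DATE", m.group()) for m in DATE_PATTERN.finditer(sentence)]


doc = [
    "This Agreement is governed by the laws of Ontario.",
    "The term shall commence on January 1, 2020.",
    "Notices shall be sent in writing.",
]

print(two_stage_extract(doc, toy_is_relevant, toy_tag_sentence))
print(balanced_sample(doc, [False, True, False]))
```

In this sketch the tagger never sees the two non-relevant sentences, which mirrors the abstract's point that most sentences in a legal document contain no target information. In practice both stages would be learned models, and the balanced sample would serve as training data for the entity-level tagger.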
Pages: 2397-2404
Page count: 8