The Utility of Context When Extracting Entities From Legal Documents

Cited by: 3
Authors
Donnelly, Jonathan [1]
Roegiest, Adam [1]
Affiliations
[1] Kira Syst, Toronto, ON, Canada
DOI
10.1145/3340531.3412746
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
When reviewing documents for legal tasks such as mergers and acquisitions, granular information (such as start dates and exit clauses) needs to be identified and extracted. Inspired by previous work in Named Entity Recognition (NER), we investigate how NER techniques can be leveraged to aid lawyers in this review process. Because target information is extremely rare in legal documents, we find that the traditional approach of tagging every sentence in a document is inferior, in both effectiveness and the data required to train and predict, to a two-stage approach: a first-pass layer identifies sentences that are likely to contain the relevant information, and the more traditional sentence-level sequence tagging is then run on those sentences only. Moreover, we find that such entity-level models can be improved by training on a balanced sample of relevant and non-relevant sentences. We additionally describe the use of our system in production and how its usage by clients means that deep learning architectures tend to be cost-inefficient, especially with respect to the time needed to train models.
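The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the keyword filter `is_candidate`, the regex tagger `tag_entities`, and the helper `balanced_sample` are hypothetical stand-ins for a trained sentence classifier, a trained sequence tagger, and the balanced training sample, respectively.

```python
import random
import re

# Hypothetical stand-in for a trained first-pass sentence classifier.
def is_candidate(sentence: str) -> bool:
    """Flag sentences that look likely to mention a start date."""
    cues = ("commence", "effective", "start")
    return any(cue in sentence.lower() for cue in cues)

# Hypothetical stand-in for a sentence-level sequence tagger.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_PATTERN = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")

def tag_entities(sentence: str) -> list[str]:
    """Return the date-like spans found in one sentence."""
    return DATE_PATTERN.findall(sentence)

def extract(sentences: list[str]) -> list[tuple[str, list[str]]]:
    """Run the tagger only on sentences that pass the first-pass filter."""
    return [(s, tag_entities(s)) for s in sentences if is_candidate(s)]

def balanced_sample(relevant: list[str], non_relevant: list[str], seed: int = 0) -> list[str]:
    """Downsample the larger class so both classes contribute equally."""
    rng = random.Random(seed)
    n = min(len(relevant), len(non_relevant))
    return rng.sample(relevant, n) + rng.sample(non_relevant, n)

if __name__ == "__main__":
    document = [
        "This Agreement shall commence on January 1, 2020.",
        "The parties met on March 3, 2019 to discuss terms.",
        "Governing law is Ontario.",
    ]
    for sentence, entities in extract(document):
        print(entities, "<-", sentence)
```

In this toy document only the first sentence passes the filter, so the tagger never sees the date in the second sentence. That is the intended trade-off: the second stage processes far fewer sentences, at the risk of missing entities that the first pass filters out.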
Pages: 2397-2404
Page count: 8