Semantic role extraction in law texts: a comparative analysis of language models for legal information extraction

Cited by: 0
Authors
Bakker, Roos M. [1, 2]
Schoevers, Akke J. [1, 3]
van Drie, Romy A. N. [1]
Schraagen, Marijn P. [3]
de Boer, Maaike H. T. [1]
Affiliations
[1] TNO, Dept Data Sci, The Hague, Netherlands
[2] Leiden Univ, Ctr Linguist, Leiden, Netherlands
[3] Univ Utrecht, Nat Language Proc, Utrecht, Netherlands
Keywords
Semantic role labelling; Large language models; Legislation; Legal information extraction; Legal semantic roles
DOI
10.1007/s10506-025-09437-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Norms are essential in our society: they dictate how individuals should behave and interact within a community. They can be written down in laws or other written sources. Interpretations often differ; this is where formalisations offer a solution. They express an interpretation of a source of norms in a transparent manner. However, creating these interpretations is labour-intensive. Natural language processing techniques can support this process. Previous work showed the potential of transformer-based models for Dutch law texts. In this paper, we (1) introduce a dataset of 2335 English sentences annotated with legal semantic roles in accordance with the Flint framework; (2) fine-tune a collection of language models on this dataset; and (3) query two non-fine-tuned generative large language models (LLMs). This allows us to compare the performance of fine-tuned domain-specific, task-specific, and general language models with that of non-fine-tuned generative LLMs. The results show that models fine-tuned on our dataset have the best performance (accuracy around 0.88). Furthermore, domain-specific models perform better than general models, indicating that domain knowledge is of added value for this task. Finally, different methods of querying LLMs perform unsatisfactorily, with maximum accuracy scores around 0.6. This indicates that for specific tasks, such as this adaptation of semantic role labelling, annotating data and fine-tuning a smaller language model is preferable to querying a generative LLM, especially when domain-specific models are available.
Pages: 35