Neural Data Augmentation for Legal Overruling Task: Small Deep Learning Models vs. Large Language Models

Cited: 0
Authors
Sheik, Reshma [1 ]
Sundara, K. P. Siva [2 ]
Nirmala, S. Jaya [1 ]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Tiruchirapalli 620015, India
[2] Coimbatore Inst Technol, Dept Elect & Commun Engn, Coimbatore 641013, India
Keywords
Deep learning; Natural language processing; Data augmentation; Legal overruling task; Transformer; Few-shot; GPT-3; Large language models
DOI
10.1007/s11063-024-11574-4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep learning models produce impressive results in natural language processing applications when given an effective learning strategy and trained on large labeled datasets. However, annotating massive training data is prohibitively expensive in the legal domain, since it requires trained legal professionals. Data augmentation addresses the problem of learning without large labeled datasets. In this paper, we employ pre-trained language models and prompt engineering to generate large-scale pseudo-labeled data for the legal overruling task from only 100 data samples. We train small recurrent and convolutional deep-learning models on this data and fine-tune several transformer models. We then evaluate the effectiveness of the models, with and without data augmentation, on the benchmark dataset and analyze the results. We also compare these models against the state-of-the-art GPT-3 model under a few-shot setting. Our experimental findings demonstrate that models trained with data augmentation outperform those trained without it on the legal overruling task. Furthermore, our best-performing deep learning model trained on augmented data outperforms few-shot GPT-3 by 18% in F1-score. Our results also highlight that small neural networks trained on augmented data achieve outcomes comparable to those of other large language models.
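For readers curious how such prompt-based augmentation can be wired up, below is a minimal sketch, not the authors' actual pipeline: the model name (`gpt2`), prompt wording, and seed sentences are illustrative assumptions, whereas the paper uses GPT-3-class models and 100 real seed samples. The idea is to prompt a pre-trained language model with a labeled seed sentence and inherit that label for each generated sentence (pseudo-labeling).

```python
# Sketch of prompt-based data augmentation for a binary legal overruling
# classifier. Model, prompt, and seeds are illustrative assumptions only.
from transformers import pipeline

# Hypothetical labeled seed sentences standing in for the paper's 100 samples.
seeds = [
    ("We therefore overrule Smith v. Jones to the extent it conflicts "
     "with this opinion.", "overruling"),
    ("The defendant filed a motion to dismiss for lack of jurisdiction.",
     "non-overruling"),
]

# Small open model used here for convenience; the paper relies on larger LMs.
generator = pipeline("text-generation", model="gpt2")

def augment(seed_text: str, label: str, n: int = 3) -> list[tuple[str, str]]:
    """Generate n pseudo-labeled sentences conditioned on one labeled seed."""
    prompt = (
        f"Legal sentence ({label}): {seed_text}\n"
        f"Another legal sentence ({label}):"
    )
    outputs = generator(
        prompt,
        max_new_tokens=40,
        num_return_sequences=n,
        do_sample=True,
        temperature=0.9,
        pad_token_id=generator.tokenizer.eos_token_id,
    )
    augmented = []
    for out in outputs:
        # Keep only the newly generated continuation, up to the first newline.
        new_text = out["generated_text"][len(prompt):].strip().split("\n")[0]
        if new_text:
            # The generated sentence inherits the seed's label (pseudo-label).
            augmented.append((new_text, label))
    return augmented

pseudo_labeled = []
for text, label in seeds:
    pseudo_labeled.extend(augment(text, label))

print(f"Generated {len(pseudo_labeled)} pseudo-labeled examples")
```

In the workflow described in the abstract, the generated sentences would be filtered and then combined with the seed data to train the small recurrent and convolutional classifiers and to fine-tune the transformer models.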
Pages: 21
Related Papers (50 results)
  • [41] Open set task augmentation facilitates generalization of deep neural networks trained on small data sets
    Wadhah Zai El Amri
    Felix Reinhart
    Wolfram Schenck
    Neural Computing and Applications, 2022, 34 : 6067 - 6083
  • [42] Simple models vs. deep learning in detecting low ejection fraction from the electrocardiogram
    Hughes, John Weston
    Somani, Sulaiman
    Elias, Pierre
    Tooley, James
    Rogers, Albert J.
    Poterucha, Timothy
    Haggerty, Christopher M.
    Salerno, Michael
    Ouyang, David
    Ashley, Euan
    Zou, James
    Perez, Marco V.
    EUROPEAN HEART JOURNAL - DIGITAL HEALTH, 2024, 5 (04): 427 - 434
  • [43] Can Small Language Models With Retrieval-Augmented Generation Replace Large Language Models When Learning Computer Science?
    Liu, Suqing
    Yu, Zezhu
    Huang, Feiran
    Bulbulia, Yousef
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024, 2024: 388 - 393
  • [44] Large language models as assistance for glaucoma surgical cases: a ChatGPT vs. Google Gemini comparison
    Carla, Matteo Mario
    Gambini, Gloria
    Baldascino, Antonio
    Boselli, Francesco
    Giannuzzi, Federico
    Margollicci, Fabio
    Rizzo, Stanislao
    GRAEFES ARCHIVE FOR CLINICAL AND EXPERIMENTAL OPHTHALMOLOGY, 2024, 262 (09) : 2945 - 2959
  • [45] Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models
    Vaithilingam, Priyan
    Zhang, Tianyi
    Glassman, Elena L.
    EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022, 2022
  • [47] Harnessing large language models for data-scarce learning of polymer properties
    Ning Liu
    Siavash Jafarzadeh
    Brian Y. Lattimer
    Shuna Ni
    Jim Lua
    Yue Yu
    Nature Computational Science, 2025, 5 (3): 245 - 254
  • [48] Learning viscoelasticity models from indirect data using deep neural networks
    Xu, Kailai
    Tartakovsky, Alexandre M.
    Burghardt, Jeff
    Darve, Eric
    COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2021, 387
  • [49] Advanced deep learning and large language models for suicide ideation detection on social media
    Qorich, Mohammed
    El Ouazzani, Rajae
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2024, 13 (02) : 135 - 147