Neural Data Augmentation for Legal Overruling Task: Small Deep Learning Models vs. Large Language Models

Cited: 0
Authors
Sheik, Reshma [1 ]
Sundara, K. P. Siva [2 ]
Nirmala, S. Jaya [1 ]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Tiruchirapalli 620015, India
[2] Coimbatore Inst Technol, Dept Elect & Commun Engn, Coimbatore 641013, India
Keywords
Deep learning; Natural language processing; Data augmentation; Legal overruling task; Transformer; Few-shot; GPT-3; Large language models
DOI
10.1007/s11063-024-11574-4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep learning models produce impressive results in natural language processing applications when given an effective learning strategy and trained on large labeled datasets. However, annotating massive training data is prohibitively expensive in the legal domain, since it requires trained legal professionals. Data augmentation addresses the problem of learning without large labeled datasets. In this paper, we employ pre-trained language models and prompt engineering to generate large-scale pseudo-labeled data for the legal overruling task from only 100 data samples. We train small recurrent and convolutional deep-learning models on this data and fine-tune several transformer models. We then evaluate the effectiveness of the models, with and without data augmentation, on the benchmark dataset and analyze the results. We also compare these models against the state-of-the-art GPT-3 model under a few-shot setting. Our experimental findings demonstrate that models trained with data augmentation outperform those trained without it on the legal overruling task. Furthermore, our best-performing deep learning model trained on augmented data outperforms few-shot GPT-3 by 18% in F1-score. Our results also highlight that small neural networks trained on augmented data achieve outcomes comparable to those of other large language models.
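For readers curious how such prompt-based augmentation can be wired up, below is a minimal sketch, not the authors' actual pipeline: the model name (`gpt2`), prompt wording, and seed sentences are illustrative assumptions, whereas the paper uses GPT-3-class models and 100 real seed samples. The idea is to prompt a pre-trained language model with a labeled seed sentence and inherit that label for each generated sentence (pseudo-labeling).

```python
# Sketch of prompt-based data augmentation for a binary legal overruling
# classifier. Model, prompt, and seeds are illustrative assumptions only.
from transformers import pipeline

# Hypothetical labeled seed sentences standing in for the paper's 100 samples.
seeds = [
    ("We therefore overrule Smith v. Jones to the extent it conflicts "
     "with this opinion.", "overruling"),
    ("The defendant filed a motion to dismiss for lack of jurisdiction.",
     "non-overruling"),
]

# Small open model used here for convenience; the paper relies on larger LMs.
generator = pipeline("text-generation", model="gpt2")

def augment(seed_text: str, label: str, n: int = 3) -> list[tuple[str, str]]:
    """Generate n pseudo-labeled sentences conditioned on one labeled seed."""
    prompt = (
        f"Legal sentence ({label}): {seed_text}\n"
        f"Another legal sentence ({label}):"
    )
    outputs = generator(
        prompt,
        max_new_tokens=40,
        num_return_sequences=n,
        do_sample=True,
        temperature=0.9,
        pad_token_id=generator.tokenizer.eos_token_id,
    )
    augmented = []
    for out in outputs:
        # Keep only the newly generated continuation, up to the first newline.
        new_text = out["generated_text"][len(prompt):].strip().split("\n")[0]
        if new_text:
            # The generated sentence inherits the seed's label (pseudo-label).
            augmented.append((new_text, label))
    return augmented

pseudo_labeled = []
for text, label in seeds:
    pseudo_labeled.extend(augment(text, label))

print(f"Generated {len(pseudo_labeled)} pseudo-labeled examples")
```

In the workflow described in the abstract, the generated sentences would be filtered and then combined with the seed data to train the small recurrent and convolutional classifiers and to fine-tune the transformer models.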
Pages: 21
Related Papers (50 results)
  • [41] Open set task augmentation facilitates generalization of deep neural networks trained on small data sets
    Wadhah Zai El Amri
    Felix Reinhart
    Wolfram Schenck
    Neural Computing and Applications, 2022, 34 : 6067 - 6083
  • [42] Simple models vs. deep learning in detecting low ejection fraction from the electrocardiogram
    Hughes, John Weston
    Somani, Sulaiman
    Elias, Pierre
    Tooley, James
    Rogers, Albert J.
    Poterucha, Timothy
    Haggerty, Christopher M.
    Salerno, Michael
    Ouyang, David
    Ashley, Euan
    Zou, James
    Perez, Marco V.
    EUROPEAN HEART JOURNAL - DIGITAL HEALTH, 2024, 5 (04): 427 - 434
  • [43] Can Small Language Models With Retrieval-Augmented Generation Replace Large Language Models When Learning Computer Science?
    Liu, Suqing
    Yu, Zezhu
    Huang, Feiran
    Bulbulia, Yousef
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024, 2024: 388 - 393
  • [44] Large language models as assistance for glaucoma surgical cases: a ChatGPT vs. Google Gemini comparison
    Carla, Matteo Mario
    Gambini, Gloria
    Baldascino, Antonio
    Boselli, Francesco
    Giannuzzi, Federico
    Margollicci, Fabio
    Rizzo, Stanislao
    GRAEFES ARCHIVE FOR CLINICAL AND EXPERIMENTAL OPHTHALMOLOGY, 2024, 262 (09) : 2945 - 2959
  • [45] Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models
    Vaithilingam, Priyan
    Zhang, Tianyi
    Glassman, Elena L.
    EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022, 2022
  • [47] Harnessing large language models for data-scarce learning of polymer properties
    Ning Liu
    Siavash Jafarzadeh
    Brian Y. Lattimer
    Shuna Ni
    Jim Lua
    Yue Yu
    Nature Computational Science, 2025, 5 (3): 245 - 254
  • [48] Learning viscoelasticity models from indirect data using deep neural networks
    Xu, Kailai
    Tartakovsky, Alexandre M.
    Burghardt, Jeff
    Darve, Eric
    COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2021, 387
  • [49] Advanced deep learning and large language models for suicide ideation detection on social media
    Qorich, Mohammed
    El Ouazzani, Rajae
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2024, 13 (02) : 135 - 147