Neural Data Augmentation for Legal Overruling Task: Small Deep Learning Models vs. Large Language Models

Cited by: 0
Authors
Sheik, Reshma [1 ]
Sundara, K. P. Siva [2 ]
Nirmala, S. Jaya [1 ]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Tiruchirapalli 620015, India
[2] Coimbatore Inst Technol, Dept Elect & Commun Engn, Coimbatore 641013, India
Keywords
Deep learning; Natural language processing; Data augmentation; Legal overruling task; Transformer; Few-shot; GPT-3; Large language models
DOI
10.1007/s11063-024-11574-4
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep learning models produce impressive results in many natural language processing applications when given a good learning strategy and large labeled training datasets. However, annotating massive training data is prohibitively expensive in the legal domain, because it requires trained legal professionals. Data augmentation addresses the problem of learning without large labeled datasets. In this paper, we employ pre-trained language models and prompt engineering to generate large-scale pseudo-labeled data for the legal overruling task from only 100 labeled samples. We train small recurrent and convolutional deep learning models on this data and fine-tune several other transformer models. We then evaluate the models, both with and without data augmentation, on the benchmark dataset and analyze the results. We also compare these models against the state-of-the-art GPT-3 model in a few-shot setting. Our experimental findings demonstrate that models trained with data augmentation outperform models trained without it on the legal overruling task. Furthermore, our best-performing deep learning model trained on augmented data outperforms few-shot GPT-3 by 18% in F1-score. Additionally, our results show that small neural networks trained on augmented data achieve results comparable to those of other large language models.
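The abstract describes the augmentation recipe only at a high level. The following is a minimal Python sketch of the general prompt-engineering pattern it names: show a completion-style LLM a few labeled seed sentences and ask it to produce one more of the same class, pseudo-labeling the output with the class it was prompted for. This is an illustration under stated assumptions, not the paper's actual pipeline: the `llm_complete` helper is a hypothetical stand-in for a real GPT-3-style API call, and the prompt template, seed sentences, and parameters are all invented for the example.

# Minimal sketch (assumption, not the paper's actual pipeline) of
# prompt-based data augmentation for the legal overruling task.
import random

SEED_DATA = [
    # (sentence, label) pairs; 1 = overruling, 0 = non-overruling.
    # The paper starts from 100 such labeled samples.
    ("To the extent that Smith v. Jones holds otherwise, it is overruled.", 1),
    ("The plaintiff moved for summary judgment on both claims.", 0),
]

LABEL_DESC = {
    1: "explicitly overrules a prior decision",
    0: "does not overrule any prior decision",
}

PROMPT_TEMPLATE = (
    "Each sentence below comes from a legal opinion and {desc}.\n"
    "{examples}\n"
    "Write one more sentence of the same kind:\n"
)

def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call. Returns a canned string so the sketch runs;
    swap in a real completion API (e.g. GPT-3) to generate actual data."""
    return "Insofar as Doe v. Roe conflicts with this opinion, it is overruled."

def augment(seed, per_class=5, shots=3):
    """Generate pseudo-labeled sentences one class at a time, so each
    generated sentence inherits its label directly from the prompt."""
    augmented = []
    for label, desc in LABEL_DESC.items():
        pool = [sent for sent, lab in seed if lab == label]
        for _ in range(per_class):
            examples = "\n".join(random.sample(pool, min(shots, len(pool))))
            prompt = PROMPT_TEMPLATE.format(desc=desc, examples=examples)
            augmented.append((llm_complete(prompt).strip(), label))
    return augmented

if __name__ == "__main__":
    for sentence, label in augment(SEED_DATA, per_class=2):
        print(label, sentence)

Generating each class separately is what makes the output pseudo-labeled: the label comes for free from the prompt, so the generated sentences need no human annotation before the small models are trained on them.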
Pages: 21
Related Papers
50 records in total
  • [21] Data Augmentation to Stabilize Image Caption Generation Models in Deep Learning
    Aldabbas, Hamza
    Asad, Muhammad
    Ryalat, Mohammad Hashem
    Malik, Kaleem Razzaq
    Qureshi, Muhammad Zubair Akbar
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (10) : 571 - 579
  • [23] Data Augmentation in Training Deep Learning Models for Malware Family Classification
    Ding, Yuxin
    Wang, Guangbin
    Ma, Yubin
    Ding, Haoxuan
    PROCEEDINGS OF 2021 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2021, : 102 - 107
  • [24] Looking Through the Deep Glasses: How Large Language Models Enhance Explainability of Deep Learning Models
    Spitzer, Philipp
    Celis, Sebastian
    Martin, Dominik
    Kuehl, Niklas
    Satzger, Gerhard
    PROCEEDINGS OF THE 2024 CONFERENCE ON MENSCH UND COMPUTER, MUC 2024, 2024, : 566 - 570
  • [25] Morphology aware data augmentation with neural language models for online hybrid ASR
    Tarjan, Balazs
    Fegyo, Tibor
    Mihajlik, Peter
    ACTA LINGUISTICA ACADEMICA, 2022, 69 (04): : 581 - 598
  • [26] Embodied human language models vs. Large Language Models, or why Artificial Intelligence cannot explain the modal be able to
    Torres-Martinez, Sergio
    BIOSEMIOTICS, 2024, 17 (01) : 185 - 209
  • [28] Financial sentiment analysis: Classic methods vs. deep learning models
    Karanikola, Aikaterini
    Davrazos, Gregory
    Liapis, Charalampos M.
    Kotsiantis, Sotiris
    INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2023, 17 (04): : 893 - 915
  • [29] Data Augmentation for Intent Classification with Off-the-shelf Large Language Models
    Sahu, Gaurav
    Rodriguez, Pau
    Laradji, Issam H.
    Atighehchian, Parmida
    Vazquez, David
Bahdanau, Dzmitry
    PROCEEDINGS OF THE 4TH WORKSHOP ON NLP FOR CONVERSATIONAL AI, 2022, : 47 - 57
  • [30] Event extraction based on self-data augmentation with large language models
    Yang, Lishan
    Fan, Xi
    Wang, Xiangyu
    Wang, Xin
    Chen, Qiuju
    MEMETIC COMPUTING, 2025, 17 (01)