Assisting Drafting of Chinese Legal Documents Using Fine-Tuned Pre-trained Large Language Models

Cited by: 0
|
Authors
Lin, Chun-Hsien [1 ]
Cheng, Pu-Jen [1 ]
Affiliations
[1] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, 1, Sect 4, Roosevelt Rd, Taipei 10617, Taiwan
Keywords
Chinese legal document drafting; Fine-tuning large language models; Text generation evaluation;
DOI
10.1007/s12626-025-00179-5
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Fine-tuning pretrained large language models (LLMs) has become a mainstream paradigm for solving downstream natural language processing (NLP) tasks. However, training a language model for legal applications requires a large corpus of legal documents so that the model can learn legal terminology and the particular formatting conventions of legal writing. Typical NLP approaches rely on manually annotated datasets for training, but such annotated datasets are difficult to obtain in the legal domain. In this study, a large corpus of publicly available, annotation-free Chinese legal documents, used without word segmentation, was employed to fine-tune a pretrained LLM to generate content for legal document drafts. The fine-tuning was performed locally, preserving information privacy and improving security. Finally, an evaluation method for the generated documents was developed to enable objective assessment of draft quality.
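As a concrete illustration of the workflow the abstract describes, the sketch below fine-tunes a causal language model on raw, annotation-free legal text with the Hugging Face Transformers and Datasets libraries, running entirely on a local machine. It is a minimal sketch under stated assumptions: the base checkpoint name, file paths, and hyperparameters are placeholders for illustration, not the model or settings used in the paper.

    # Minimal sketch: local causal-LM fine-tuning on raw (annotation-free) Chinese
    # legal text with Hugging Face Transformers. Checkpoint name, paths, and
    # hyperparameters are placeholders, not the paper's actual configuration.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    base_model = "uer/gpt2-chinese-cluecorpussmall"  # placeholder Chinese GPT-2 checkpoint
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(base_model)
    if tokenizer.pad_token is None:                  # some causal LMs ship without a pad token
        tokenizer.pad_token = tokenizer.eos_token

    # Plain-text legal documents; no manual annotation or word segmentation is
    # required, since the subword tokenizer operates directly on raw Chinese text.
    raw = load_dataset("text", data_files={"train": "legal_corpus/*.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="legal-draft-lm",
                               per_device_train_batch_size=4,
                               num_train_epochs=3),
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()  # all computation stays on the local machine

After training, such a model can be prompted with case facts or a document header to produce draft text, which can then be scored with an evaluation method like the one the abstract mentions.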
Pages: 83-110
Number of pages: 28