Assisting Drafting of Chinese Legal Documents Using Fine-Tuned Pre-trained Large Language Models

Cited by: 0
Authors
Lin, Chun-Hsien [1 ]
Cheng, Pu-Jen [1 ]
Affiliations
[1] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, 1, Sect 4, Roosevelt Rd, Taipei 10617, Taiwan
Keywords
Chinese legal document drafting; Fine-tuning large language models; Text generation evaluation
DOI
10.1007/s12626-025-00179-5
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline Code
0812
Abstract
Fine-tuning pretrained large language models (LLMs) has become a mainstream paradigm for solving downstream natural language processing tasks. However, training a language model for legal applications requires a large corpus of legal documents so that the model can learn legal terminology and the particularities of legal formatting. Typical NLP approaches rely on manually annotated datasets for training, but such datasets are difficult to obtain in the legal field. In this study, a large corpus of public, annotation-free Chinese legal documents, used without word segmentation, was employed to fine-tune a pretrained LLM to generate content for legal document drafts. Moreover, fine-tuning was performed locally, preserving information privacy and improving security. Finally, an evaluation method was developed to objectively assess the quality of the generated drafts.
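The pipeline the abstract describes has two technical steps: fine-tuning a pretrained LLM on raw legal text, and scoring the generated drafts. Below is a minimal sketch of the first step using the Hugging Face Transformers Trainer on an unsegmented Chinese corpus; the model name, file path, and hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
# Hypothetical fine-tuning sketch; the model, paths, and hyperparameters
# are placeholder assumptions, not the paper's actual setup.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "uer/gpt2-chinese-cluecorpussmall"  # assumed Chinese base model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Raw, annotation-free legal documents: one document per line.
dataset = load_dataset("text", data_files={"train": "legal_corpus.txt"})

def tokenize(batch):
    # No Chinese word segmentation: the subword tokenizer operates
    # directly on the raw text, as the abstract describes.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="legal-draft-model",  # runs entirely on the local machine
        per_device_train_batch_size=4,
        num_train_epochs=3,
        save_strategy="epoch",
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```

The paper develops its own evaluation method, which the abstract does not detail; as one common baseline for text-generation evaluation, a character-level ROUGE-L F-score could be computed as follows (the rouge-score package and character granularity are assumptions, not necessarily what the authors used):

```python
# Hypothetical evaluation sketch: character-level ROUGE-L between a
# generated draft and a reference document.
from rouge_score import rouge_scorer

def char_rouge_l(reference: str, candidate: str) -> float:
    # Space-separate the characters so ROUGE treats each Chinese
    # character as a token (no word segmentation is required).
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)
    return scorer.score(" ".join(reference), " ".join(candidate))["rougeL"].fmeasure

print(char_rouge_l("原告请求被告给付新台币十万元", "原告请求被告给付十万元"))
```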
Pages: 83-110
Page count: 28