Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model

Cited by: 1
Authors
Yuan, Mingruo [1 ]
Kao, Ben [1 ]
Wu, Tien-Hsuan [1 ]
Cheung, Michael M. K. [2 ]
Chan, Henry W. H. [2 ]
Cheung, Anne S. Y. [2 ]
Chan, Felix W. H. [2 ]
Chen, Yongxi [3 ]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Pokfulam, Hong Kong, Peoples R China
[2] Univ Hong Kong, Fac Law, Pokfulam, Hong Kong, Peoples R China
[3] Australian Natl Univ, Coll Law, Canberra, ACT 2601, Australia
Keywords
Legal knowledge dissemination; Navigability and comprehensibility of legal information; Machine question generation; Pre-trained language model; Readability
DOI
10.1007/s10506-023-09367-6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Access to legal information is fundamental to access to justice. Yet accessibility refers not only to making legal documents available to the public, but also to rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents such as legislation and judgments, which are often highly technical, into knowledge that is easily navigable and comprehensible to those without legal education. In this study, we formulate a three-step approach for bringing legal knowledge to laypersons, tackling the issues of navigability and comprehensibility. First, we translate selected sections of the law into snippets (called CLIC-pages), each a short article that explains a particular technical legal concept in layperson's terms. Second, we construct a Legal Question Bank (LQB), a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive CLIC Recommender (CRec). Given a user's verbal description of a legal situation that requires a legal solution, CRec interprets the user's input, shortlists the questions from the LQB that are most likely relevant to the given situation, and recommends their corresponding CLIC-pages, where the relevant legal knowledge can be found. In this paper we focus on the technical aspects of creating an LQB. We show how large-scale pre-trained language models, such as GPT-3, can be used to generate legal questions. We compare machine-generated questions (MGQs) against human-composed questions (HCQs) and find that MGQs are more scalable, cost-effective, and diversified, while HCQs are more precise. We also show a prototype of CRec and illustrate through an example how our three-step approach effectively brings relevant legal knowledge to the public.
Pages: 769-805
Page count: 37
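
The abstract notes that large pre-trained language models such as GPT-3 can be used to generate the LQB's questions from CLIC-pages. Below is a minimal sketch of what such machine question generation could look like, assuming the OpenAI Python client; the prompt wording, the model name, and the generate_questions helper are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch of machine question generation (MGQ) from a CLIC-page
# with a large pre-trained language model. Prompt design and model choice
# are illustrative assumptions, not the paper's actual setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_questions(clic_page_text: str, n_questions: int = 5) -> list[str]:
    """Ask the model for layperson-style questions answerable by the page."""
    prompt = (
        "The following text explains a legal concept in plain language.\n\n"
        f"{clic_page_text}\n\n"
        f"Write {n_questions} questions a layperson might ask whose answers "
        "can be found in the text above. One question per line."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in; the paper used GPT-3-era models
        messages=[{"role": "user", "content": prompt}],
        temperature=0.9,      # a higher temperature favours diverse questions
    )
    lines = resp.choices[0].message.content.splitlines()
    # Strip list numbering/bullets the model may prepend to each question.
    return [q.strip(" -0123456789.") for q in lines if q.strip()]
```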
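Likewise, CRec's shortlisting step can be pictured as semantic retrieval over the question bank. The sketch below assumes sentence-transformers embeddings and cosine similarity; the model name, the toy LQB entries, and the shortlist helper are hypothetical, as the abstract does not specify the matching mechanism.

```python
# Hypothetical sketch of the CRec shortlisting step: embed the user's
# description and every LQB question, then recommend the CLIC-pages behind
# the most similar questions. Model choice and data layout are assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy LQB: each question is paired with the CLIC-page that answers it.
lqb = [
    ("What can I do if my landlord keeps my deposit?", "clic-page-tenancy-17"),
    ("Is it legal to record a phone call in Hong Kong?", "clic-page-privacy-04"),
    ("How do I dispute a parking ticket?", "clic-page-traffic-09"),
]


def shortlist(user_description: str, top_k: int = 2) -> list[tuple[str, str]]:
    """Return the top-k (question, CLIC-page) pairs most similar to the input."""
    questions = [q for q, _ in lqb]
    q_emb = model.encode(questions, convert_to_tensor=True)
    u_emb = model.encode(user_description, convert_to_tensor=True)
    scores = util.cos_sim(u_emb, q_emb)[0]          # one score per question
    ranked = scores.argsort(descending=True)[:top_k]
    return [lqb[int(i)] for i in ranked]


print(shortlist("My landlord refuses to return my rental deposit"))
```

Matching against questions rather than pages keeps the recommendation explainable: the user sees the matched question before opening the corresponding CLIC-page.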