Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model

Cited by: 1
Authors
Yuan, Mingruo [1 ]
Kao, Ben [1 ]
Wu, Tien-Hsuan [1 ]
Cheung, Michael M. K. [2 ]
Chan, Henry W. H. [2 ]
Cheung, Anne S. Y. [2 ]
Chan, Felix W. H. [2 ]
Chen, Yongxi [3 ]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Pokfulam, Hong Kong, Peoples R China
[2] Univ Hong Kong, Fac Law, Pokfulam, Hong Kong, Peoples R China
[3] Australian Natl Univ, Coll Law, Canberra, ACT 2601, Australia
Keywords
Legal knowledge dissemination; Navigability and comprehensibility of legal information; Machine question generation; Pre-trained language model; Readability
DOI
10.1007/s10506-023-09367-6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Access to legal information is fundamental to access to justice. Yet accessibility refers not only to making legal documents available to the public, but also to rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents such as legislation and judgments, which are often highly technical, into knowledge that is easily navigable and comprehensible to those without legal education. In this study, we formulate a three-step approach for bringing legal knowledge to laypersons, tackling the issues of navigability and comprehensibility. First, we translate selected sections of the law into snippets (called CLIC-pages), each a short article that explains a particular technical legal concept in layperson's terms. Second, we construct a Legal Question Bank (LQB), a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive CLIC Recommender (CRec). Given a user's verbal description of a legal situation that requires a legal solution, CRec interprets the user's input, shortlists the questions from the LQB that are most likely relevant to the given situation, and recommends their corresponding CLIC-pages, where the relevant legal knowledge can be found. In this paper we focus on the technical aspects of creating an LQB. We show how large-scale pre-trained language models, such as GPT-3, can be used to generate legal questions. We compare machine-generated questions (MGQs) against human-composed questions (HCQs) and find that MGQs are more scalable, cost-effective, and diversified, while HCQs are more precise. We also show a prototype of CRec and illustrate through an example how our three-step approach effectively brings relevant legal knowledge to the public.
Pages: 769-805
Page count: 37
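
The abstract notes that large pre-trained language models such as GPT-3 can be used to generate the LQB's questions from CLIC-pages. Below is a minimal sketch of what such machine question generation could look like, assuming the OpenAI Python client; the prompt wording, the model name, and the generate_questions helper are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch of machine question generation (MGQ) from a CLIC-page
# with a large pre-trained language model. Prompt design and model choice
# are illustrative assumptions, not the paper's actual setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_questions(clic_page_text: str, n_questions: int = 5) -> list[str]:
    """Ask the model for layperson-style questions answerable by the page."""
    prompt = (
        "The following text explains a legal concept in plain language.\n\n"
        f"{clic_page_text}\n\n"
        f"Write {n_questions} questions a layperson might ask whose answers "
        "can be found in the text above. One question per line."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in; the paper used GPT-3-era models
        messages=[{"role": "user", "content": prompt}],
        temperature=0.9,      # a higher temperature favours diverse questions
    )
    lines = resp.choices[0].message.content.splitlines()
    # Strip list numbering/bullets the model may prepend to each question.
    return [q.strip(" -0123456789.") for q in lines if q.strip()]
```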
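Likewise, CRec's shortlisting step can be pictured as semantic retrieval over the question bank. The sketch below assumes sentence-transformers embeddings and cosine similarity; the model name, the toy LQB entries, and the shortlist helper are hypothetical, as the abstract does not specify the matching mechanism.

```python
# Hypothetical sketch of the CRec shortlisting step: embed the user's
# description and every LQB question, then recommend the CLIC-pages behind
# the most similar questions. Model choice and data layout are assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy LQB: each question is paired with the CLIC-page that answers it.
lqb = [
    ("What can I do if my landlord keeps my deposit?", "clic-page-tenancy-17"),
    ("Is it legal to record a phone call in Hong Kong?", "clic-page-privacy-04"),
    ("How do I dispute a parking ticket?", "clic-page-traffic-09"),
]


def shortlist(user_description: str, top_k: int = 2) -> list[tuple[str, str]]:
    """Return the top-k (question, CLIC-page) pairs most similar to the input."""
    questions = [q for q, _ in lqb]
    q_emb = model.encode(questions, convert_to_tensor=True)
    u_emb = model.encode(user_description, convert_to_tensor=True)
    scores = util.cos_sim(u_emb, q_emb)[0]          # one score per question
    ranked = scores.argsort(descending=True)[:top_k]
    return [lqb[int(i)] for i in ranked]


print(shortlist("My landlord refuses to return my rental deposit"))
```

Matching against questions rather than pages keeps the recommendation explainable: the user sees the matched question before opening the corresponding CLIC-page.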