Corpus for Automatic Structuring of Legal Documents

Authors
Kalamkar, Prathamesh [1 ,2 ]
Tiwari, Aman [1 ,2 ]
Agarwal, Astha [1 ,2 ]
Karn, Saurabh [3 ]
Gupta, Smita [3 ]
Raghavan, Vivek [1 ]
Modi, Ashutosh [4 ]
Affiliations
[1] EkStep Fdn, Bangalore, Karnataka, India
[2] Thoughtworks Technol India Pvt Ltd, Bangalore, Karnataka, India
[3] Agami, Bangalore, Karnataka, India
[4] Indian Inst Technol Kanpur IIT K, Kanpur, Uttar Pradesh, India
Keywords
Legal NLP; Rhetorical Roles; Legal Document Segmentation; Extraction
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In populous countries, pending legal cases have been growing exponentially. There is a need for developing techniques for processing and organizing legal documents. In this paper, we introduce a new corpus for structuring legal documents. In particular, we introduce a corpus of legal judgment documents in English that are segmented into topical and coherent parts. Each of these parts is annotated with a label coming from a list of pre-defined Rhetorical Roles. We develop baseline models for automatically predicting rhetorical roles in a legal document based on the annotated corpus. Further, we show the application of rhetorical roles to improve performance on the tasks of summarization and legal judgment prediction. We release the corpus and baseline model code along with the paper.
Pages: 4420-4429
Number of pages: 10