Corpus for Automatic Structuring of Legal Documents

Authors
Kalamkar, Prathamesh [1, 2]
Tiwari, Aman [1, 2]
Agarwal, Astha [1, 2]
Karn, Saurabh [3]
Gupta, Smita [3]
Raghavan, Vivek [1]
Modi, Ashutosh [4]
Affiliations
[1] EkStep Fdn, Bangalore, Karnataka, India
[2] Thoughtworks Technol India Pvt Ltd, Bangalore, Karnataka, India
[3] Agami, Bangalore, Karnataka, India
[4] Indian Inst Technol Kanpur IIT K, Kanpur, Uttar Pradesh, India
Keywords
Legal NLP; Rhetorical Roles; Legal Document Segmentation; Extraction
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In populous countries, pending legal cases have been growing exponentially. There is a need for developing techniques for processing and organizing legal documents. In this paper, we introduce a new corpus for structuring legal documents. In particular, we introduce a corpus of legal judgment documents in English that are segmented into topical and coherent parts. Each of these parts is annotated with a label coming from a list of pre-defined Rhetorical Roles. We develop baseline models for automatically predicting rhetorical roles in a legal document based on the annotated corpus. Further, we show the application of rhetorical roles to improve performance on the tasks of summarization and legal judgment prediction. We release the corpus and baseline model code along with the paper.
Pages: 4420-4429
Page count: 10