Towards Automatic Structuring and Semantic Indexing of Legal Documents

Cited by: 7
Authors
Koniaris, Marios [1]
Papastefanatos, George [2]
Vassiliou, Yannis [1]
Affiliations
[1] Natl Tech Univ Athens, KDBS Lab, Sch ECE, Athens, Greece
[2] Athena Res Ctr, Inst Management Informat Syst, Maroussi, Greece
Keywords
Legislation; legal text analysis; natural language processing; Web
DOI
10.1145/3003733.3003801
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In recent years, the number of freely available legal resources has increased greatly. Portals that allow users to search for legislation using keywords are now commonplace. However, in the vast majority of those portals, legal documents are not stored in a structured format with a rich set of metadata, but in a presentation-oriented manifestation. This makes it impossible for end users to query the semantics of the documents, such as date of enactment, date of repeal, or jurisdiction, or to reuse information and establish interconnections with similar repositories. In this paper, we present an approach for extracting a machine-readable semantic representation of legislation from unstructured document formats. Our method exploits common formats of legal documents to identify blocks of structural and semantic information and models them according to a popular legal meta-schema. Our proposed method is highly extensible and achieves high accuracy for a variety of legal and para-legal documents, especially legislation. Our evaluation results reveal that our methodology can be of great assistance for the automatic structuring and semantic indexing of legal resources.
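As a rough illustration of what identifying structural and semantic blocks in plain legal text might look like, the Python sketch below splits a toy statute into articles and numbered paragraphs and pulls out an enactment date. It is not the authors' method: the regular expressions, the `Article` dataclass, the `parse_legal_text` function, and the sample text are all hypothetical, and the dataclass merely stands in for whatever legal meta-schema the paper targets.

```python
import re
from dataclasses import dataclass, field

# Hypothetical heading conventions; real legislation varies by jurisdiction.
ARTICLE_RE = re.compile(r"^Article\s+(\d+)\s*$", re.MULTILINE)
PARAGRAPH_RE = re.compile(r"^(\d+)\.\s+", re.MULTILINE)
DATE_RE = re.compile(r"Enacted on (\d{4}-\d{2}-\d{2})")


@dataclass
class Article:
    """Illustrative stand-in for one structural block of a legal document."""
    number: str
    paragraphs: list = field(default_factory=list)


def parse_legal_text(text):
    """Split plain text into articles and numbered paragraphs, plus metadata."""
    match = DATE_RE.search(text)
    metadata = {"enacted": match.group(1) if match else None}
    articles = []
    # Splitting on a pattern with one capture group yields
    # [preamble, number1, body1, number2, body2, ...].
    parts = ARTICLE_RE.split(text)
    for number, body in zip(parts[1::2], parts[2::2]):
        # Same layout here: [lead, n1, text1, n2, text2, ...].
        chunks = PARAGRAPH_RE.split(body.strip())
        paragraphs = [chunk.strip() for chunk in chunks[2::2]]
        articles.append(Article(number, paragraphs))
    return metadata, articles


# Invented sample text, used only to exercise the sketch.
SAMPLE = """Law 1234. Enacted on 2016-11-10.
Article 1
1. This law applies to public bodies.
2. Private bodies are excluded.
Article 2
1. The law enters into force on publication.
"""

if __name__ == "__main__":
    metadata, articles = parse_legal_text(SAMPLE)
    print(metadata)
    for article in articles:
        print(article.number, article.paragraphs)
```

A pattern-based splitter like this only works when documents follow a predictable layout, which is the property the abstract says the method exploits. Anything beyond that, such as mapping the blocks to a concrete schema, is left out of the sketch.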
Pages: 6