Extracting Definitions from Brazilian Legal Texts

Cited by: 0
Authors
Ferneda, Edilson [1]
do Prado, Hercules Antonio [1, 2]
Batista, Augusto Herrmann [1, 3]
Pinheiro, Marcello Sandi [4]
Affiliations
[1] Univ Catolica Brasilia, Grad Program Knowledge & IT Management, SGAN 916 Av W5, BR-70790160 Brasilia, DF, Brazil
[2] Embrapa Management & Strategy Secretariat, BR-7077090 Brasilia, DF, Brazil
[3] Minist Planning Budget & Management, Logist & Informat Technol Secretariat, BR-70046900 Brasilia, DF, Brazil
[4] Univ Fed Rio de Janeiro, COPPE, BR-2194197 Rio De Janeiro, RJ, Brazil
Keywords
Information extraction; Definition extraction; Natural Language Processing;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
In order to avoid ambiguity and to ensure, as far as possible, a strict interpretation of the law, legal texts usually define the specific lexical terms used within their discourse by means of normative rules. With an often large number of rules in effect in a given domain, extracting these definitions manually would be a costly undertaking. This paper presents an approach to this problem based on a variation of an automated natural language processing technique for Brazilian Portuguese texts. For the sake of generality, the proposed solution was developed to address the broader problem of building a glossary from domain-specific texts that contain definitions. The solution was applied to a corpus of texts from the telecommunications regulation domain, and the results are reported. The usual natural language processing pipeline was followed: preprocessing, segmentation, and part-of-speech tagging. A set of feature extraction functions is specified and used, together with reference glossary information on whether or not a text fragment is a definition, to train an SVM classifier. Finally, the definitions are extracted from the texts and evaluated against a testing corpus, which also contains the reference glossary annotations on definitions. The results are then discussed in light of other definition extraction techniques.
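The classification step described in the abstract (feature extraction functions plus glossary labels used to train an SVM) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the cue phrases, the two surface features, the toy sentences, and the choice of a linear SVM trained by plain subgradient descent on the hinge loss. The sketch also skips the preprocessing, segmentation, and part-of-speech tagging stages and works directly on already segmented sentences.

```python
# Hypothetical definitional cue phrases for Brazilian Portuguese (assumed, not from the paper).
CUE_PHRASES = ("considera-se", "entende-se por", "define-se", "denomina-se")


def extract_features(sentence):
    """Map a sentence to [has_cue_phrase, has_leading_colon, bias]."""
    lowered = sentence.lower()
    has_cue = any(cue in lowered for cue in CUE_PHRASES)
    # "Term: gloss" layout: a colon closing one of the first three tokens.
    has_colon = any(token.endswith(":") for token in sentence.split()[:3])
    return [float(has_cue), float(has_colon), 1.0]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            for j in range(len(w)):
                # The hinge term contributes only when the margin is violated.
                hinge_grad = -yi * xi[j] if margin < 1 else 0.0
                w[j] -= lr * (lam * w[j] + hinge_grad)
    return w


def is_definition(sentence, w):
    """Classify a sentence by the sign of the linear decision function."""
    score = sum(wj * xj for wj, xj in zip(w, extract_features(sentence)))
    return score > 0


# Toy training set standing in for the reference glossary annotations
# (+1 = definition, -1 = not a definition). The sentences are invented.
TRAIN = [
    ("Para os fins deste regulamento, considera-se estação o conjunto "
     "de equipamentos necessários à realização de telecomunicação.", 1),
    ("Entende-se por serviço de telecomunicações o conjunto de atividades "
     "que possibilita a oferta de telecomunicação.", 1),
    ("Usuário: pessoa natural ou jurídica que utiliza o serviço.", 1),
    ("A prestadora deve enviar o relatório até o final de cada mês.", -1),
    ("Este regulamento entra em vigor na data de sua publicação.", -1),
    ("O descumprimento sujeita a infratora às sanções cabíveis.", -1),
]

X = [extract_features(text) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
weights = train_linear_svm(X, y)

if __name__ == "__main__":
    for text in ("Considera-se assinante a pessoa que firma contrato com a prestadora.",
                 "A Anatel publicará o ato no Diário Oficial."):
        print(is_definition(text, weights), text)
```

A realistic system would replace the two surface features with functions over part-of-speech tags and would train and evaluate on separate annotated corpora, as the abstract describes; the sketch only shows how labeled fragments and feature vectors feed a maximum-margin classifier.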
Pages: 631-646
Number of pages: 16