Application of Character-Level Language Models in the Domain of Polish Statutory Law

Cited by: 1
Authors
Smywinski-Pohl, Aleksander [1]
Wrobel, Krzysztof [2]
Lasocki, Karol [1]
Jungiewicz, Michal [1]
Affiliations
[1] AGH Univ Sci & Technol, Krakow, Poland
[2] Jagiellonian Univ, Krakow, Poland
Keywords
character-level language models; cross-reference recognition; language modelling; legal text processing; Polish law
DOI
10.3233/FAIA190328
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Polish statutory law has so far been distributed as PDF, HTML and text files, in which the structure of the rules and the references to internal and external regulations are provided only implicitly. As a result, automatic processing of the regulations in legal information systems is complicated, since the semi-structured text needs to be converted to a structured form. In this research, we show how character-level language models help in this task. We apply them to two problems: detecting cross-references to structural units (e.g. articles, points) and detecting cross-references to statutory laws (titles of laws and ordinances). We obtain 98.7% macro-average F1 on the first problem and 95.8% F1 on the second.
Pages: 217-222
Number of pages: 6