Combining shallow and deep learning approaches against data scarcity in legal domains

Cited by: 5
Authors
Sovrano, Francesco [1]
Palmirani, Monica [2]
Vitali, Fabio [1]
Affiliations
[1] Univ Bologna, DISI, Bologna, Italy
[2] Univ Bologna, CIRSFID AI, Bologna, Italy
Keywords
Data scarcity; Deep learning; TF-IDF; Syntagmatic relations; Law; KNOWLEDGE; WORLD; HYPERMEDIA; GOVERNMENT; ONTOLOGY; SYSTEMS; WEB; WWW;
DOI
10.1016/j.giq.2022.101715
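Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work];
Subject classification code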
1205 ; 120501 ;
Abstract
We are currently witnessing a radical shift towards digitisation in many aspects of daily life, including law, public administration and governance. This shift has sometimes been pursued with the aim of reducing costs and human errors by improving data analysis and management, but not without raising major technological challenges. One of these challenges is the need to cope with relatively small amounts of data without sacrificing performance. Indeed, cutting-edge approaches to (natural) language processing and understanding are often data-hungry, especially those based on deep learning. With this paper we seek to address the problem of data scarcity in automatic Legalese (or legal English) processing and understanding. What we propose is an ensemble of shallow and deep learning techniques called SyntagmTuner, designed to combine the accuracy of deep learning with the ability of shallow learning to work with little data. Our contribution is based on the assumption that Legalese differs from spoken language in the way meaning is encoded by the structure of the text and the co-occurrence of words. As a result, we show with SyntagmTuner how we can perform important tasks for e-governance, such as multi-label classification of the United Nations General Assembly (UNGA) Resolutions or legal question answering, with data-sets of roughly 100 samples or even fewer.
Pages: 13
Related papers
50 in total
  • [1] Combining deep and shallow approaches in parsing German
    Schiehlen, M
    [J]. 41ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 2003, : 112 - 119
  • [2] Effective deep learning approaches for summarization of legal texts
    Anand, Deepa
    Wagh, Rupali
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (05) : 2141 - 2150
  • [3] Satellite Imagery Classification Using Shallow and Deep Learning Approaches
    Sainos-Vizuett, Michelle
    Hussein Lopez-Nava, Irvin
    [J]. PATTERN RECOGNITION (MCPR 2021), 2021, 12725 : 163 - 172
  • [4] Deep Learning for French Legal Data Categorization
    Hammami, Eya
    Akermi, Imen
    Faiz, Rim
    Boughanem, Mohand
    [J]. MODEL AND DATA ENGINEERING, MEDI 2019, 2019, 11815 : 96 - 105
  • [5] Combining Crowdsourcing and Deep Learning to Explore the Mesoscale Organization of Shallow Convection
    Rasp, Stephan
    Schulz, Hauke
    Bony, Sandrine
    Stevens, Bjorn
    [J]. BULLETIN OF THE AMERICAN METEOROLOGICAL SOCIETY, 2020, 101 (11) : E1980 - E1995
  • [6] A Systematic Review on Data Scarcity Problem in Deep Learning: Solution and Applications
    Bansal, Aayushi
    Sharma, Rewa
    Kathuria, Mamta
    [J]. ACM COMPUTING SURVEYS, 2022, 54 (10S)
  • [7] DEEP LEARNING APPROACHES FOR CLASSIFYING DATA: A REVIEW
    Bikku, Thulasi
    Sree, K. P. N. V. Satya
    [J]. JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2020, 15 (04) : 2580 - 2594
  • [8] Neurocomputing guest editorial for the special issue: Advances in deep and shallow machine learning approaches for handling data irregularities
    Das, Swagatam
    Garcia, Salvador
    Triguero, Isaac
    [J]. NEUROCOMPUTING, 2021, 419 : 259 - 260
  • [9] Evolving Deep Learning Models for Epilepsy Diagnosis in Data Scarcity Context: A Survey
    Aldahr, Raghdah Saem
    Alanazi, Munid
    Ilyas, Mohammad
    [J]. 2022 45TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING, TSP, 2022, : 66 - 73
  • [10] Shallow and Deep Non-IID Learning on Complex Data
    Cao, Longbing
    Yu, Philip S.
    Zhao, Zhilin
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 4774 - 4775