Data Quality for Deep Learning of Judgment Documents: An Empirical Study

Cited by: 0

Authors:

Liu, Jiawei [1,2]

Wang, Dong [2]

Wang, Zhenzhen [2,3]

Chen, Zhenyu [1,2]

Affiliations:

[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Peoples R China

[2] Software Testing Engn Lab Jiangsu Prov, Nanjing, Peoples R China

[3] Jinling Inst Technol, Sch Software, Nanjing, Peoples R China

Source:

SEMANTIC TECHNOLOGY, JIST 2019 | 2020 / Vol. 1157

Funding:

National Natural Science Foundation of China;

Keywords:

Judgment document; Deep learning; Quality measurement; Natural language processing;

DOI:

10.1007/978-981-15-3412-6_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The revolution in hardware technology has made it possible to obtain high-definition data through highly sophisticated algorithms. Deep learning has emerged and is now widely used in various fields, and the judicial domain is no exception. As the carrier of litigation activities, judgment documents record the process and results of the people's courts, and their quality directly affects the fairness and credibility of the law. For measuring the quality of judgment documents, interpretability is an indispensable dimension. Unfortunately, because of various uncontrollable factors during the process, such as data transmission and storage, the data set used for training is usually of poor quality. Moreover, because the distribution of case data is severely imbalanced, data augmentation is essential to generate data for low-frequency cases. Based on the existing data set and the application scenarios, we explore data quality issues in four areas. We then systematically investigate them to determine their impact on the data set. Finally, we compare the four dimensions to find out which one does the most damage to the data set.

Pages: 43-50

Page count: 8

50 items in total

[1] Data Augmentation for Deep Learning of Judgment Documents
Yan, Ge
Li, Yu
Zhang, Shu
Chen, Zhenyu
INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING: BIG DATA AND MACHINE LEARNING, PT II, 2019, 11936 : 232 - 242
[2] Analysis of Criminal Case Judgment Documents Based on Deep Learning
Han, Jinbo
Li, Dakui
Yang, Nanhai
Liu, Zhu
Nan, Qiong
PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON ADVANCED CONTROL, AUTOMATION AND ARTIFICIAL INTELLIGENCE (ACAAI 2018), 2018, 155 : 261 - 264
[3] An Empirical Study on Quality Issues of Deep Learning Platform
Gao, Yanjie
Shi, Xiaoxiang
Lin, Haoxiang
Zhang, Hongyu
Wu, Hao
Li, Rui
Yang, Mao
2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE, ICSE-SEIP, 2023 : 455 - 466
[4] Quality Measurement of Judgment Documents
Liu, Jiawei
Wang, Zhenzhen
Yan, Ge
Lian, Hao
2019 COMPANION OF THE 19TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS-C 2019), 2019 : 296 - 299
[5] From Data Quality to Model Quality: An Exploratory Study on Deep Learning
He, Tianxing
Yu, Shengcheng
Wang, Ziyuan
Li, Jieqiong
Chen, Zhenyu
11TH ASIA-PACIFIC SYMPOSIUM ON INTERNETWARE (INTERNETWARE 2019), 2019
[6] Deep learning for encrypted traffic classification in the face of data drift: An empirical study
Malekghaini, Navid
Akbari, Elham
Salahuddin, Mohammad A.
Limam, Noura
Boutaba, Raouf
Mathieu, Bertrand
Moteau, Stephanie
Tuffin, Stephane
COMPUTER NETWORKS, 2023, 225
[7] The Scent of Deep Learning Code: An Empirical Study
Jebnoun, Hadhemi
Ben Braiek, Houssem
Rahman, Mohammad Masudur
Khomh, Foutse
2020 IEEE/ACM 17TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2020 : 420 - 430
[8] An Empirical Study on Data Distribution-Aware Test Selection for Deep Learning Enhancement
Hu, Qiang
Guo, Yuejun
Cordy, Maxime
Xie, Xiaofei
Ma, Lei
Papadakis, Mike
Le Traon, Yves
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2022, 31 (04)
[9] Applicability of Deep Learning Models for Stock Price Forecasting An Empirical Study on BANKEX Data
Balaji, A. Jayanth
Ram, D. S. Harish
Nair, Binoy B.
8TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING & COMMUNICATIONS (ICACC-2018), 2018, 143 : 947 - 953
[10] A deep learning-based quality assessment model of collaboratively edited documents: A case study of Wikipedia
Wang, Ping
Li, Xiaodan
Wu, Renli
JOURNAL OF INFORMATION SCIENCE, 2021, 47 (02) : 176 - 191