Applying Domain Knowledge for Data Quality Assessment in Dermatology

Authors
Igic, Nemanja [1]
Terzic, Branko [1]
Matic, Milan [2]
Ivancevic, Vladimir [1]
Lukovic, Ivan [1]
Affiliations
[1] Univ Novi Sad, Fac Tech Sci, Novi Sad, Serbia
[2] Univ Novi Sad, Fac Med, Novi Sad, Serbia
Keywords
Dermatology; Data quality assessment; Domain knowledge application; INFORMATION-SYSTEMS
DOI
10.1007/978-3-319-59424-8_14
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Dermatology Clinic at the Clinical Center of Vojvodina, Novi Sad, Serbia, has actively collected data on patients' treatment, health insurance, and examinations. These data were stored in documents in the comma-separated values (CSV) format. Because many fields in these documents were entered as free-form text or allowed null values, many data records are inconsistent with the real-world system. There is currently a strong need for an analytic system that can analyze these data and find relevant patterns. Since such a system would require clean and accurate data, the quality of the data must first be assessed. A data quality system should therefore be designed and built to identify inaccurate records so that they can be aligned with the real-world state. In our approach to data quality assessment, domain knowledge about the data is used to define rules, which are then used to evaluate the quality of the data. In this paper, we present the architecture of a data quality system that is used to define and apply these rules. The rules are first defined by a domain expert and then applied to the data in order to determine the number of records that do not match the defined rules and to identify the exact anomalies in those records. We also present a case study in which we applied this data quality system to the data collected by the Dermatology Clinic.
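The rule-based assessment described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the column names (`patient_id`, `age`, `insurance`), the three rules, and the sample records are all hypothetical and chosen only to show how expert-defined rules might be applied to CSV records to count non-matching records and name the exact rule each one violates.

```python
import csv
import io
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    """A named check that returns True when a record satisfies the rule."""
    name: str
    check: Callable[[dict], bool]


def _is_valid_age(value) -> bool:
    # Free-form text or a missing value fails the conversion and thus the rule.
    try:
        return 0 <= int(value) <= 120
    except (TypeError, ValueError):
        return False


# Hypothetical rules standing in for those a domain expert would define.
RULES = [
    Rule("patient_id is present",
         lambda r: bool((r.get("patient_id") or "").strip())),
    Rule("age is an integer in 0..120",
         lambda r: _is_valid_age(r.get("age"))),
    Rule("insurance is a known code",
         lambda r: (r.get("insurance") or "").strip() in {"public", "private", "none"}),
]


def assess(csv_text: str, rules: list) -> tuple:
    """Return (total record count, {record number: [names of violated rules]})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    anomalies = {}
    total = 0
    for record_number, record in enumerate(reader, start=1):
        total += 1
        failed = [rule.name for rule in rules if not rule.check(record)]
        if failed:
            anomalies[record_number] = failed
    return total, anomalies


# Invented sample data; not taken from the clinic's records.
SAMPLE = """patient_id,age,insurance
P001,34,public
,51,private
P003,abc,public
P004,29,unknown
"""

total, anomalies = assess(SAMPLE, RULES)
print(f"{len(anomalies)} of {total} records violate at least one rule")
for record_number, failed in anomalies.items():
    print(f"record {record_number}: {', '.join(failed)}")
```

Keeping each rule as a named predicate separates rule definition from rule application, which mirrors the two steps the abstract describes; a real system would presumably load the rules from an expert-facing definition format rather than hard-code them.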
Pages: 147-156
Number of pages: 10