Applying Domain Knowledge for Data Quality Assessment in Dermatology

Authors
Igic, Nemanja [1]
Terzic, Branko [1]
Matic, Milan [2]
Ivancevic, Vladimir [1]
Lukovic, Ivan [1]
Affiliations
[1] Univ Novi Sad, Fac Tech Sci, Novi Sad, Serbia
[2] Univ Novi Sad, Fac Med, Novi Sad, Serbia
Keywords
Dermatology; Data quality assessment; Domain knowledge application; Information systems
DOI
10.1007/978-3-319-59424-8_14
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Dermatology Clinic at the Clinical Center of Vojvodina, Novi Sad, Serbia, has actively collected data on patients' treatment, health insurance, and examinations. These data were stored in comma-separated values (CSV) documents. Because many fields in these documents were free-form text or allowed null values, many records are inconsistent with the real-world system. There is currently a strong need for an analytic system that can analyze these data and find relevant patterns. Since such a system would require clean and accurate data, data quality must first be assessed. A data quality system should therefore be designed and built to identify inaccurate records so that they can be aligned with the real-world state. In our approach to data quality assessment, domain knowledge about the data is used to define rules, which are then used to evaluate the quality of the data. In this paper, we present the architecture of a data quality system used to define and apply these rules. The rules are first defined by a domain expert and then applied to the data to determine the number of records that do not match the defined rules and to identify the exact anomalies in those records. We also present a case study in which we applied this data quality system to the data collected by the Dermatology Clinic.
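The rule-based assessment described in the abstract can be illustrated with a minimal Python sketch. This is not the system presented in the paper: the field names (`patient_id`, `age`, `diagnosis`), the three rules, and the sample records are all hypothetical, chosen only to show how expert-defined rules might be applied to CSV records to count non-matching records and list the exact anomalies in each.

```python
import csv
import io
from typing import Callable, Dict, List, Tuple

# A rule pairs a readable name with a predicate over one CSV record.
Rule = Tuple[str, Callable[[Dict[str, str]], bool]]


def is_present(value) -> bool:
    """True if the field holds non-blank text (None and '' count as missing)."""
    return bool((value or "").strip())


def is_int_in_range(value, low: int, high: int) -> bool:
    """True if the field parses as an integer within [low, high]."""
    try:
        return low <= int(value) <= high
    except (TypeError, ValueError):
        return False


# Hypothetical rules a domain expert might define; not taken from the paper.
RULES: List[Rule] = [
    ("patient_id is present", lambda r: is_present(r.get("patient_id"))),
    ("age is an integer in 0..120", lambda r: is_int_in_range(r.get("age"), 0, 120)),
    ("diagnosis is present", lambda r: is_present(r.get("diagnosis"))),
]


def assess(rows: List[Dict[str, str]], rules: List[Rule]):
    """Return the number of records violating any rule and, per record
    index, the names of the rules that record violates."""
    anomalies: Dict[int, List[str]] = {}
    for index, row in enumerate(rows):
        failed = [name for name, check in rules if not check(row)]
        if failed:
            anomalies[index] = failed
    return len(anomalies), anomalies


# Made-up sample data standing in for one of the clinic's CSV documents.
SAMPLE = """patient_id,age,diagnosis
P001,34,psoriasis
,41,eczema
P003,abc,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
bad_count, anomalies = assess(rows, RULES)
print(f"{bad_count} of {len(rows)} records violate at least one rule")
for index, failed in anomalies.items():
    print(f"record {index}: {failed}")
```

Keeping each rule as a named predicate means the report can state which rule a record broke, not just that the record is invalid, which mirrors the abstract's goal of identifying the exact anomalies.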
Pages: 147-156
Number of pages: 10