Conditional Dependencies: A Principled Approach to Improving Data Quality

Authors
Fan, Wenfei [1]
Geerts, Floris [1]
Jia, Xibei [1]
Affiliations
[1] Univ Edinburgh, Edinburgh EH8 9YL, Midlothian, Scotland
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Real-life data is often dirty and costs billions of pounds to businesses worldwide each year. This paper presents a promising approach to improving data quality. It effectively detects and fixes inconsistencies in real-life data based on conditional dependencies, an extension of database dependencies by enforcing bindings of semantically related data values. It accurately identifies records from unreliable data sources by leveraging relative candidate keys, an extension of keys for relations by supporting similarity and matching operators across relations. In contrast to traditional dependencies that were developed for improving the quality of schema, the revised constraints are proposed to improve the quality of data. These constraints yield practical techniques for data repairing and record matching in a uniform framework.
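The idea of a conditional dependency, a dependency that holds only on the tuples matching a pattern of data values, can be illustrated with a minimal sketch. The code below is not the paper's algorithm; it is a simple check under stated assumptions. The function names, the `_` wildcard convention, the attribute names (`CC`, `zip`, `street`, `city`) and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

WILDCARD = "_"  # assumed convention: "_" in a pattern matches any value


def matches(row, attrs, pattern):
    """True if the row agrees with every constant in the pattern."""
    return all(p == WILDCARD or row[a] == p for a, p in zip(attrs, pattern))


def cfd_violations(rows, lhs, rhs, lhs_pattern, rhs_pattern):
    """Return sorted indices of rows violating the rule (lhs -> rhs) under the patterns."""
    violating = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        if not matches(row, lhs, lhs_pattern):
            continue  # the rule applies only to rows matching the LHS pattern
        if not matches(row, rhs, rhs_pattern):
            violating.add(i)  # single-row violation of an RHS constant
        groups[tuple(row[a] for a in lhs)].append(i)
    for members in groups.values():
        # Rows that agree on the LHS must also agree on the RHS.
        if len({tuple(rows[i][a] for a in rhs) for i in members}) > 1:
            violating.update(members)
    return sorted(violating)


# Hypothetical sample data.
customers = [
    {"CC": "44", "zip": "EH8 9YL", "street": "Crichton St", "city": "Edinburgh"},
    {"CC": "44", "zip": "EH8 9YL", "street": "Mayfield Rd", "city": "Edinburgh"},
    {"CC": "01", "zip": "07974", "street": "Mountain Ave", "city": "New York"},
    {"CC": "01", "zip": "07974", "street": "Main St", "city": "New York"},
]

# Illustrative rule: where CC = 44, zip determines street.
uk_violations = cfd_violations(
    customers, ["CC", "zip"], ["street"], ["44", WILDCARD], [WILDCARD]
)
# Illustrative rule: CC = 01 and zip = 07974 imply city = Murray Hill.
city_violations = cfd_violations(
    customers, ["CC", "zip"], ["city"], ["01", "07974"], ["Murray Hill"]
)

print(uk_violations)    # [0, 1]
print(city_violations)  # [2, 3]
```

In this sketch the first rule flags only the two rows with CC = 44, even though the rows with CC = 01 also share a zip and differ on street. That is the sense in which the dependency is conditional. The second rule shows a constant binding, which a single row can violate on its own.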
Pages: 8 - 20
Page count: 13
Related papers
50 in total
  • [1] Improving the Data Quality of Drug Databases using Conditional Dependencies and Ontologies
    Cure, Olivier
    [J]. ACM JOURNAL OF DATA AND INFORMATION QUALITY, 2012, 4 (01)
  • [2] Improving XML Data Quality with Functional Dependencies
    Tan, Zijing
    Zhang, Liyong
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT I, 2011, 6587 : 450 - 465
  • [3] Semandaq: A Data Quality System Based on Conditional Functional Dependencies
    Fan, Wenfei
    Geerts, Floris
    Jia, Xibei
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (02) : 1460 - 1463
  • [4] Conditional functional dependencies for data cleaning
    Bohannon, Philip
    Fan, Wenfei
    Geerts, Floris
    Jia, Xibei
    Kementsietsidis, Anastasios
    [J]. 2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2007 : 721 - 730
  • [5] Conditional dependencies in imprecise data handling
    Filipowicz, Wlodzimierz
    [J]. KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES 2021), 2021, 192 : 80 - 89
  • [6] Data repair of density-based data cleaning approach using conditional functional dependencies
    Al-Janabi, Samir
    Janicki, Ryszard
    [J]. DATA TECHNOLOGIES AND APPLICATIONS, 2022, 56 (03) : 429 - 446
  • [7] Conditional functional dependencies for capturing data inconsistencies
    Fan, Wenfei
    Geerts, Floris
    Jia, Xibei
    Kementsietsidis, Anastasios
    [J]. ACM TRANSACTIONS ON DATABASE SYSTEMS, 2008, 33 (02)
  • [8] Improving the Validity of CER through Principled Exploration of Data
    Meyer, Anne-Marie
    Liu, Huan
    Mack, Christina
    Carpenter, William R.
    Brookhart, M. Alan
    [J]. PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2013, 22 : 126 - 127
  • [9] QUALITY ENHANCEMENT AS GROUNDED ACTION: A PRINCIPLED APPROACH
    Gynnild, Vidar
    [J]. ICERI2016: 9TH INTERNATIONAL CONFERENCE OF EDUCATION, RESEARCH AND INNOVATION, 2016 : 49 - 52
  • [10] Semantic of Data Dependencies to Improve the Data Quality
    Zaidi, Houda
    Pollet, Yann
    Boufares, Faouzi
    Kraiem, Naoufel
    [J]. MODEL AND DATA ENGINEERING, MEDI 2015, 2015, 9344 : 53 - 61