Discovering Relaxed Functional Dependencies Based on Multi-Attribute Dominance

Cited by: 21
Authors
Caruccio, Loredana [1 ]
Deufemia, Vincenzo [1 ]
Naumann, Felix [2 ]
Polese, Giuseppe [1 ]
Affiliations
[1] Univ Salerno, Dept Comp Sci, I-84084 Fisciano, SA, Italy
[2] Univ Potsdam, Hasso Plattner Inst, D-14482 Potsdam, Germany
Keywords
Complexity theory; Approximation algorithms; Big Data; Distributed databases; Semantics; Lakes; Functional dependencies; Data profiling; Data cleansing; Efficient discovery
DOI
10.1109/TKDE.2020.2967722
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the advent of big data and data lakes, data are often integrated from multiple sources. Such integrated data are often of poor quality, due to inconsistencies, errors, and so forth. One way to check the quality of data is to infer functional dependencies (FDs). However, in many modern applications it might be necessary to extract properties and relationships that are not captured through FDs, due to the necessity to admit exceptions, or to consider similarity rather than equality of data values. Relaxed FDs (RFDs) have been introduced to meet these needs, but their discovery from data adds further complexity to an already complex problem, partly because similarity and validity thresholds must be specified. We propose Domino, a new discovery algorithm for RFDs that exploits the concept of dominance in order to derive similarity thresholds of attribute values while inferring RFDs. An experimental evaluation on real datasets demonstrates the discovery performance and the effectiveness of the proposed algorithm.
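
To make the two relaxations named in the abstract concrete (similarity instead of equality, and tolerated exceptions), here is a minimal Python sketch of how a single relaxed FD could be validated against a small table. It is not the Domino algorithm and does not discover anything: the thresholds are supplied by hand, whereas Domino is described as deriving them. The function name `rfd_holds`, the absolute-difference similarity, the pairwise violation ratio, and the sample data are all assumptions made for illustration.

```python
from itertools import combinations


def rfd_holds(rows, lhs, rhs, thresholds, max_violation_ratio=0.0):
    """Check a relaxed FD lhs -> rhs over a list of dict rows.

    Assumptions (not taken from the paper): values are numeric, two
    values are 'similar' on attribute a when their absolute difference
    is at most thresholds[a], and the dependency is accepted when the
    fraction of violating tuple pairs is at most max_violation_ratio.
    """
    def similar(r1, r2, attrs):
        return all(abs(r1[a] - r2[a]) <= thresholds[a] for a in attrs)

    pairs = 0
    violations = 0
    for r1, r2 in combinations(rows, 2):
        pairs += 1
        # A pair violates the RFD if it is similar on every LHS
        # attribute but not similar on the RHS attribute.
        if similar(r1, r2, lhs) and not similar(r1, r2, [rhs]):
            violations += 1
    return pairs == 0 or violations / pairs <= max_violation_ratio


# Hypothetical sample data.
rows = [
    {"height": 170, "weight": 65, "shoe": 41},
    {"height": 171, "weight": 66, "shoe": 41},
    {"height": 185, "weight": 90, "shoe": 45},
    {"height": 170, "weight": 64, "shoe": 44},
]
thresholds = {"height": 1, "weight": 1, "shoe": 1}

strict = rfd_holds(rows, ["height"], "shoe", thresholds)
tolerant = rfd_holds(rows, ["height"], "shoe", thresholds,
                     max_violation_ratio=0.4)
```

Setting every threshold to 0 and `max_violation_ratio` to 0.0 reduces the check to an ordinary (exact) FD test. With the sample data, 2 of the 6 tuple pairs violate `height -> shoe`, so the strict check fails while the tolerant one passes.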
Pages: 3212-3228
Page count: 17
Related papers
50 in total
  • [1] Discovering Relaxed Functional Dependencies based on Multi-attribute Dominance
    Caruccio, Loredana
    Deufemia, Vincenzo
    Naumann, Felix
    Polese, Giuseppe
    2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 2354 - 2355
  • [2] Association reducts: A framework for mining multi-attribute dependencies
    Slezak, D
    FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS, 2005, 3488 : 354 - 363
  • [3] Cumulative dominance in multi-attribute choice: benefits and limits
    Katsikopoulos, Konstantinos, V
    Egozcue, Martin
    Fuentes Garcia, Luis
    EURO JOURNAL ON DECISION PROCESSES, 2014, 2 (1-2) : 153 - 163
  • [4] STOCHASTIC DOMINANCE RULES FOR MULTI-ATTRIBUTE UTILITY FUNCTIONS
    HUANG, CC
    KIRA, D
    VERTINSKY, I
    REVIEW OF ECONOMIC STUDIES, 1978, 45 (03): : 611 - 615
  • [5] A Multi-attribute Auction Model by Dominance-based Rough Sets Approach
    Zhang, Rong
    Liu, Bin
    Liu, Sifeng
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2010, 7 (04) : 843 - 858
  • [6] A rapid multi-attribute functional map of the brain
    Rutkowski, JS
    Crewther, DP
    Crewther, SG
    AUSTRALIAN JOURNAL OF PSYCHOLOGY, 2003, 55 : 26 - 26
  • [7] Discovering functional dependencies with degrees of satisfaction using attribute pre-scanning
    Department of Management Science and Engineering, School of Economics and Management, Tsinghua University, Beijing 100084, China
    Qinghua Daxue Xuebao, 2009, 6 (920-924):
  • [8] Solving dominance and potential optimality in imprecise multi-attribute additive problems
    Mateos, A
    Jiménez, A
    Ríos-Insua, S
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2003, 79 (02) : 253 - 262
  • [9] Rough analysis method of multi-attribute decision making based on generalized extended dominance relation
    Hu, Ming-Li
    Liu, Si-Feng
    Kongzhi yu Juece/Control and Decision, 2007, 22 (12): : 1347 - 1351
  • [10] Rough analysis model of multi-attribute decision making based on limited similarity dominance relation
    Luo, Gong-Zhi
    Yang, Xiao-Jiang
    Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice, 2009, 29 (09): : 134 - 140