Discovering Relaxed Functional Dependencies Based on Multi-Attribute Dominance

Cited by: 21
Authors
Caruccio, Loredana [1]
Deufemia, Vincenzo [1]
Naumann, Felix [2]
Polese, Giuseppe [1]
Affiliations
[1] Univ Salerno, Dept Comp Sci, I-84084 Fisciano, SA, Italy
[2] Univ Potsdam, Hasso Plattner Inst, D-14482 Potsdam, Germany
Keywords
Complexity theory; Approximation algorithms; Big Data; Distributed databases; Semantics; Lakes; Functional dependencies; Data profiling; Data cleansing; Efficient discovery
DOI
10.1109/TKDE.2020.2967722
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the advent of big data and data lakes, data are often integrated from multiple sources. Such integrated data are often of poor quality, due to inconsistencies, errors, and so forth. One way to check the quality of data is to infer functional dependencies (fds). However, in many modern applications it might be necessary to extract properties and relationships that are not captured through fds, due to the necessity to admit exceptions, or to consider similarity rather than equality of data values. Relaxed fds (rfds) have been introduced to meet these needs, but their discovery from data adds further complexity to an already complex problem, also due to the necessity of specifying similarity and validity thresholds. We propose Domino, a new discovery algorithm for rfds that exploits the concept of dominance in order to derive similarity thresholds of attribute values while inferring rfds. An experimental evaluation on real datasets demonstrates the discovery performance and the effectiveness of the proposed algorithm.
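The two relaxations mentioned in the abstract (similarity instead of equality, and tolerance for exceptions) can be illustrated with a small sketch. This is not the Domino algorithm and does not discover anything; it only checks one given candidate dependency against fixed thresholds. The function name `rfd_holds`, the pairwise violation ratio used as a validity measure, and the toy `age`/`salary` data are all assumptions made for illustration.

```python
from itertools import combinations


def rfd_holds(rows, lhs, rhs, thresholds, max_violation_ratio=0.0):
    """Check a candidate relaxed dependency lhs -> rhs on numeric data.

    Assumptions (not taken from the paper): distance is the absolute
    difference, and validity is the share of violating pairs among the
    pairs that are similar on every lhs attribute.
    """
    relevant = 0    # pairs that are similar on all lhs attributes
    violations = 0  # relevant pairs that are NOT similar on rhs
    for a, b in combinations(rows, 2):
        if all(abs(a[x] - b[x]) <= thresholds[x] for x in lhs):
            relevant += 1
            if abs(a[rhs] - b[rhs]) > thresholds[rhs]:
                violations += 1
    if relevant == 0:
        return True  # vacuously satisfied: no pair is similar on lhs
    return violations / relevant <= max_violation_ratio


# Hypothetical toy relation
rows = [
    {"age": 30, "salary": 1000},
    {"age": 31, "salary": 1050},
    {"age": 50, "salary": 3000},
    {"age": 51, "salary": 1000},
]
thresholds = {"age": 1, "salary": 100}

strict = rfd_holds(rows, ["age"], "salary", thresholds)
tolerant = rfd_holds(rows, ["age"], "salary", thresholds, max_violation_ratio=0.5)
```

In this toy relation two pairs are similar on `age`, and one of them differs on `salary` by more than the threshold, so the dependency fails when no exceptions are allowed and passes when half of the relevant pairs may violate it. Choosing such thresholds by hand is the difficulty the abstract points to.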
Pages: 3212-3228
Page count: 17