Mining of constant conditional functional dependencies based on pruning free itemsets

Cited by: 0
Authors
Zhou J. [1 ]
Diao X. [1 ]
Cao J. [1 ]
Affiliation
[1] PLA University of Science and Technology, Nanjing
Source
Diao, Xingchun (diaoxch640222@163.com) | 2016 / Tsinghua University / Vol. 56
Keywords
Closed itemset; Conditional functional dependency; Free itemset; Functional dependency; Pruning algorithm
DOI
10.16511/j.cnki.qhdxxb.2016.21.026
Abstract
A series of pruning strategies optimizes CFDMiner, a popular algorithm for mining constant conditional functional dependencies (CCFDs), reducing the search space and improving efficiency. Theoretical analysis shows that many free and closed itemsets are invalid or redundant for producing valid CCFDs. Pruning these free itemsets and selecting only the corresponding closed itemsets therefore yields the same results as the original algorithm. Tests show that the optimized algorithm has a smaller search space and runs 4 to 5 times faster on real data. © 2016, Press of Tsinghua University. All rights reserved.
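To make the role of free and closed itemsets concrete, the following is a minimal brute-force sketch of how constant CFDs can be read off free itemsets and their closures. It is not the optimized algorithm described in the abstract and applies none of its pruning strategies. The relation, the attribute names (`CC`, `AC`, `city`), the support threshold `k`, and all function names are assumptions chosen for illustration.

```python
from itertools import combinations

def support(rows, itemset):
    """Rows that match every (attribute, value) pair in itemset."""
    return [r for r in rows if all(r[a] == v for a, v in itemset)]

def closure(rows, itemset):
    """All (attribute, value) pairs shared by every row that supports itemset."""
    matched = support(rows, itemset)
    if not matched:
        return frozenset()
    common = set(matched[0].items())
    for r in matched[1:]:
        common &= set(r.items())
    return frozenset(common)

def is_free(rows, itemset):
    """Free: dropping any single item strictly enlarges the support."""
    n = len(support(rows, itemset))
    return all(len(support(rows, itemset - {i})) > n for i in itemset)

def mine_constant_cfds(rows, k=2):
    """Enumerate rules (X -> item) where X is a k-frequent free itemset and
    item lies in closure(X) but in no closure of a proper subset of X."""
    attrs = sorted(rows[0])
    items = sorted({(a, r[a]) for r in rows for a in attrs})
    rules = []
    for size in range(1, len(attrs)):
        for combo in combinations(items, size):
            if len({a for a, _ in combo}) < size:
                continue  # two values for the same attribute never match a row
            x = frozenset(combo)
            if len(support(rows, x)) < k or not is_free(rows, x):
                continue
            rhs = set(closure(rows, x) - x)
            # Keep only right-hand sides not already implied by a smaller pattern.
            for sub_size in range(size):
                for sub in combinations(combo, sub_size):
                    rhs -= closure(rows, frozenset(sub))
            rules.extend((tuple(sorted(x)), item) for item in sorted(rhs))
    return rules

# Hypothetical toy relation: country code, area code, city.
rows = [
    {"CC": "44", "AC": "131", "city": "EDI"},
    {"CC": "44", "AC": "131", "city": "EDI"},
    {"CC": "01", "AC": "908", "city": "MH"},
    {"CC": "01", "AC": "908", "city": "MH"},
    {"CC": "01", "AC": "212", "city": "NYC"},
    {"CC": "01", "AC": "212", "city": "NYC"},
]
rules = mine_constant_cfds(rows, k=2)
for lhs, rhs in rules:
    print(lhs, "->", rhs)
```

In this toy relation every two-item pattern with enough support is non-free, so the sketch emits only single-item left-hand sides. The exhaustive enumeration over all itemsets is exactly the kind of search space that pruning is meant to shrink.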
Pages: 253-261
Page count: 8
Related papers
20 in total
  • [1] Chiang F., Miller R.J., Discovering data quality rules, Proceedings of 34th International Conference on Very Large Data Bases, pp. 1166-1177, (2008)
  • [2] Diallo T., Novelli N., Petit J.M., Discovering (frequent) constant conditional functional dependencies, International Journal of Data Mining, Modelling and Management, 4, 3, pp. 205-223, (2012)
  • [3] Fan W., Geerts F., Jia X., et al., Conditional functional dependencies for capturing data inconsistencies, ACM Transactions on Database Systems, 33, 2, pp. 1-48, (2008)
  • [4] Fan W., Dependencies revisited for improving data quality, Proceedings of 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2008, pp. 159-170, (2008)
  • [5] Liu B., Geng Y., Mining method for data quality detection rules, Pattern Recognition and Artificial Intelligence, 25, 5, pp. 835-844, (2012)
  • [6] Golab L., Karloff H., Korn F., et al., On generating near-optimal tableaux for conditional functional dependencies, Proceedings of 34th International Conference on Very Large Data Bases, pp. 376-390, (2008)
  • [7] Fan W., Geerts F., Lakshmanan L.V.S., et al., Discovering conditional functional dependencies, Proceedings of the 25th International Conference on Data Engineering (ICDE), pp. 1231-1234, (2009)
  • [8] Fan W., Geerts F., Li J., et al., Discovering conditional functional dependencies, IEEE Transactions on Knowledge & Data Engineering, 23, 5, pp. 683-698, (2011)
  • [9] Fan W., Geerts F., Foundations of Data Quality Management, (2012)
  • [10] Li H., Li J., Wong L., et al., Relative risk and odds ratio: A data mining perspective, Proceedings of 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2005, pp. 368-377, (2005)