Fault-tolerant tile mining

Authors
Lu, Haibing [1]
Zhu, Wendong [2]
Phan, Joseph [3]
Ghiassi, M. [4]
Fang, Yi [3]
Hong, Yuan [5]
He, Xiaoyun [6]
Affiliations
[1] Santa Clara Univ, Dept Operat Management & Informat Syst, Santa Clara, CA 95053 USA
[2] Global Energy Interconnect Res Inst North Amer, Santa Clara, CA USA
[3] Santa Clara Univ, Dept Comp Engn, Santa Clara, CA 95053 USA
[4] Santa Clara Univ, Dept Operat Management & Informat Syst, Santa Clara, CA 95053 USA
[5] IIT, Dept Comp Sci, Chicago, IL 60616 USA
[6] Auburn Univ, Dept Informat Syst, Montgomery, AL 36117 USA
Keywords
Itemset mining; Fault-tolerant; Optimization; Exact algorithm
DOI
10.1016/j.eswa.2018.02.007
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Interesting itemset mining is a fundamental research problem in knowledge management and machine learning. It aims to identify interesting relations between variables in a database using measures of interestingness, and its applications include market basket analysis, web usage mining, and intrusion detection. This paper proposes a new interestingness measure, the fault-tolerant tile, which is based on two observations: (1) the length of an itemset can be as important as its frequency; (2) knowledge discovery from real-world datasets calls for fault-tolerant data mining (e.g., extracting fault-tolerant association rules, analyzing noisy datasets). Given a user-defined fault tolerance value, we are interested in finding the maximum and top-k fault-tolerant tiles. Due to the exponential search space of candidate itemsets, both problems are NP-hard. While using a monotonic property to prune the search space is a common strategy in interesting itemset mining, no monotonic property is available for this problem. To tackle this challenge, we use a branch-and-bound search strategy that analyzes the characteristics of the candidate itemsets at each search branch and estimates their bounds. Our experimental results show that our algorithms can effectively analyze real datasets and retrieve meaningful results. © 2018 Elsevier Ltd. All rights reserved.
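To make the idea of a measure that weighs itemset length together with frequency more concrete, the following is a minimal Python sketch. The abstract does not state the formal definition of a fault-tolerant tile, so everything here is an assumption made for illustration: a transaction is taken to support an itemset if it misses at most `tolerance` of its items, and the tile's score is itemset length times the number of supporting transactions. The function names (`ft_support`, `ft_tile_area`, `max_ft_tile`) are hypothetical, and the search is plain exhaustive enumeration, not the paper's branch-and-bound algorithm.

```python
from itertools import combinations


def ft_support(transactions, itemset, tolerance):
    """Count transactions that miss at most `tolerance` items of `itemset`.

    Assumed fault-tolerance rule for illustration only.
    """
    return sum(1 for t in transactions if len(itemset - t) <= tolerance)


def ft_tile_area(transactions, itemset, tolerance):
    """Assumed score: itemset length times fault-tolerant support."""
    return len(itemset) * ft_support(transactions, itemset, tolerance)


def max_ft_tile(transactions, tolerance):
    """Exhaustively enumerate every non-empty itemset and keep the best one.

    The loop visits 2^n - 1 candidates for n distinct items, which is the
    exponential search space the abstract refers to.
    """
    items = sorted(set().union(*transactions))
    best_itemset, best_area = frozenset(), 0
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            area = ft_tile_area(transactions, candidate, tolerance)
            if area > best_area:  # ties keep the earlier candidate
                best_itemset, best_area = candidate, area
    return best_itemset, best_area


# Toy database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"d"},
]

print(max_ft_tile(transactions, tolerance=0))
print(max_ft_tile(transactions, tolerance=1))
```

Under these assumptions, with `tolerance=1` the score on the toy data grows from `{a}` to `{a, b}` to `{a, b, c}` and then drops for `{a, b, c, d}`, so it is neither monotonically increasing nor decreasing in itemset size. This is one way to see why the usual monotonicity-based pruning does not carry over directly.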
Pages: 25-42
Page count: 18