Discovery of frequent DATALOG patterns

Cited by: 150
Authors
Dehaspe, L
Toivonen, H
Affiliations
[1] Katholieke Univ Leuven, Dept Comp Sci, B-3001 Heverlee, Belgium
[2] Univ Helsinki, Rolf Nevanlinna Inst, FIN-00014 Helsinki, Finland
[3] Univ Helsinki, Dept Comp Sci, FIN-00014 Helsinki, Finland
Funding
Academy of Finland;
Keywords
frequent patterns; inductive logic programming; DATALOG queries; association rules; episodes; sequential patterns;
DOI
10.1023/A:1009863704807
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discovery of frequent patterns has been studied in a variety of data mining settings. In its simplest form, known from association rule mining, the task is to discover all frequent itemsets, i.e., all combinations of items that are found in a sufficient number of examples. The fundamental task of association rule and frequent set discovery has been extended in various directions, allowing more useful patterns to be discovered with special purpose algorithms. We present WARMR, a general purpose inductive logic programming algorithm that addresses frequent query discovery: a very general DATALOG formulation of the frequent pattern discovery problem. The motivation for this novel approach is twofold. First, exploratory data mining is well supported: WARMR offers the flexibility required to experiment with standard and in particular novel settings not supported by special purpose algorithms. Also, application prototypes based on WARMR can be used as benchmarks in the comparison and evaluation of new special purpose algorithms. Second, the unified representation gives insight into the blurred picture of the frequent pattern discovery domain. Within the DATALOG formulation a number of dimensions appear that relink diverged settings. We demonstrate the frequent query approach and its use on two applications, one in alarm analysis, and one in a chemical toxicology domain.
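The simplest form of the task, discovering all frequent itemsets, can be illustrated with a minimal sketch. This is a plain levelwise (Apriori-style) search over propositional itemsets, not WARMR itself, which operates on DATALOG queries. The function name `frequent_itemsets`, the `min_support` parameter, and the toy shopping data are assumptions introduced purely for illustration.

```python
from itertools import combinations


def frequent_itemsets(examples, min_support):
    """Return {itemset: count} for every itemset contained in at least
    `min_support` examples, using a levelwise search."""
    examples = [frozenset(e) for e in examples]
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for e in examples for item in e}
    frequent = {}
    size = 1
    while candidates:
        # Count the examples that contain each candidate itemset.
        counts = {c: sum(1 for e in examples if c <= e) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent sets of the current size into candidates one item larger.
        size += 1
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune candidates that have an infrequent subset of the previous size.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


# Hypothetical toy data: each example is a set of purchased items.
examples = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(examples, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The pruning step relies on the fact that every subset of a frequent itemset is itself frequent, so a candidate with an infrequent subset never needs to be counted.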
Pages: 7 - 36
Number of pages: 30
Related papers
50 in total
  • [1] Discovery of frequent DATALOG patterns
    Luc Dehaspe
    Hannu Toivonen
    [J]. Data Mining and Knowledge Discovery, 1999, 3 : 7 - 36
  • [2] Frequent patterns discovery in SWRL data set
    Yuan, Liu
    Li, Zhanhuai
    Chen, Shiliang
    [J]. Journal of Computational Information Systems, 2008, 4 (01) : 175 - 182
  • [3] Efficient discovery of frequent approximate sequential patterns
    Zhu, Feida
    Yan, Xifeng
    Han, Jiawei
    Yu, Philip S.
    [J]. ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007 : 751 - +
  • [4] MRFP: Discovery Frequent Patterns Using MapReduce Frequent Pattern Growth
    Al-Hamodi, Arkan A. G.
    Lu, Songfeng
    [J]. 2016 INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC), 2016 : 298 - 301
  • [5] Towards discovery of frequent patterns in description logics with rules
    Jozefowska, J
    Lawrynowicz, A
    Lukaszewski, T
    [J]. RULES AND RULE MARKUP LANGUAGES FOR THE SEMANTIC WEB, PROCEEDINGS, 2005, 3791 : 84 - 97
  • [6] Mining of an alarm log to improve the discovery of frequent patterns
    Fessant, F
    Clérot, F
    Dousson, C
    [J]. ADVANCES IN DATA MINING: APPLICATIONS IN IMAGE MINING, MEDICINE AND BIOTECHNOLOGY, MANAGEMENT AND ENVIRONMENTAL CONTROL, AND TELECOMMUNICATIONS, 2004, 3275 : 144 - 152
  • [7] Discovery of frequent distributed event patterns in sensor networks
    Roemer, Kay
    [J]. WIRELESS SENSOR NETWORKS, 2008, 4913 : 106 - +
  • [8] Discovery of user frequent access patterns on Web Usage Mining
    Wang, XD
    Ouyang, YM
    Hu, XG
    Zhang, Y
    [J]. PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, VOL 1, 2004 : 765 - 769
  • [9] Privacy-preserving discovery of frequent patterns in time series
    da Silva, Josenildo Costa
    Klusch, Matthias
    [J]. ADVANCES IN DATA MINING: THEORETICAL ASPECTS AND APPLICATIONS, PROCEEDINGS, 2007, 4597 : 318 - +
  • [10] Frequent Pattern Discovery in Multiple Biological Networks: Patterns and Algorithms
    Li W.
    Hu H.
    Huang Y.
    Li H.
    Mehan M.R.
    Nunez-Iglesias J.
    Xu M.
    Yan X.
    Zhou X.J.
    [J]. Statistics in Biosciences, 2012, 4 (1) : 157 - 176