CLAP: Collaborative pattern mining for distributed information systems

Cited by: 11
Authors
Zhu, Xingquan [1 ,2 ]
Li, Bin [1 ]
Wu, Xindong [3 ,4 ]
He, Dan [5 ]
Zhang, Chengqi [1 ]
Affiliations
[1] Univ Technol Sydney, QCIS Ctr, Fac Eng & Info Technol, Sydney, Ultimo 2007, Australia
[2] Florida Atlantic Univ, Dept Comp Sci & Eng, Boca Raton, FL 33431 USA
[3] Univ Vermont, Dept Comp Sci, Burlington, VT 05404 USA
[4] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
[5] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
Funding
Australian Research Council; US National Science Foundation
Keywords
Distributed data mining; Distributed association rule mining; Frequent item-sets; Bloom filter; Classification
DOI
10.1016/j.dss.2011.05.002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The purpose of data mining from distributed information systems is usually threefold: (1) identifying locally significant patterns in individual databases; (2) discovering emerging significant patterns after unifying distributed databases in a single view; and (3) finding patterns that follow special relationships across different data collections. While existing research has significantly advanced the techniques for mining local and global patterns (the first two goals), few attempts have been made to discover patterns across distributed databases (the third goal). Moreover, no framework currently exists to support the mining of all three types of patterns. This paper proposes solutions to discover patterns from distributed databases. More specifically, we consider pattern mining as a query process whose purpose is to discover patterns from distributed databases such that the relationships among the patterns satisfy user-specified query constraints. We argue that existing self-contained mining frameworks are neither efficient nor feasible for this objective, mainly because their pattern pruning is single-database oriented. To solve the problem, we advocate a cross-database pruning concept and propose a collaborative pattern (CLAP) mining framework with cross-database pruning mechanisms for distributed pattern mining. In CLAP, distributed databases collaboratively exchange pattern information between sites so that each site can leverage information from other sites to gain cross-database pruning. Experimental results show that CLAP fits a niche position, and demonstrate that CLAP not only outperforms its peers with significant runtime performance gains, but also helps find patterns that other methods cannot discover. Crown Copyright (C) 2011 Published by Elsevier B.V. All rights reserved.
Pages: 40-51
Page count: 12