Mining algorithms for sequential patterns in parallel: Hash based approach

Cited by: 0
Authors
Shintani, T [1]
Kitsuregawa, M [1]
Affiliations
[1] Univ Tokyo, Inst Ind Sci, Minato Ku, Tokyo 106, Japan
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we study the problem of mining sequential patterns in a large database of customer transactions. Since finding sequential patterns involves a large amount of customer transaction data and requires multiple passes over the database, parallel algorithms are expected to improve performance significantly. We consider parallel algorithms for mining sequential patterns in a shared-nothing environment. Three parallel algorithms are proposed: Non Partitioned Sequential Pattern Mining (NPSPM), Simply Partitioned Sequential Pattern Mining (SPSPM) and Hash Partitioned Sequential Pattern Mining (HPSPM). In NPSPM, the candidate sequences are simply copied to all the nodes, which can lead to memory overflow for large databases. The remaining two algorithms partition the candidate sequences over the nodes, which can efficiently exploit the total system's memory as the number of nodes is increased. If the candidates are partitioned simply, customer transaction data has to be broadcast to all nodes. HPSPM partitions the candidate sequences among the nodes using a hash function, which eliminates the broadcasting of customer transaction data and reduces the comparison workload. We describe the implementation of these algorithms on a shared-nothing parallel computer, the IBM SP2, and present performance evaluation results. Among the three algorithms, HPSPM attains the best performance.
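The hash-partitioning idea in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the abstract does not specify the hash function, the data layout, or the communication scheme, so everything named here (`owner`, `partition_candidates`, `count_support`, the CRC32-based hash, and the simplification that a customer sequence is a plain tuple of single items) is an assumption made for illustration. The point it tries to show is that each length-k subsequence of a customer sequence is routed to exactly one node, the one that owns the matching candidate, rather than being broadcast to every node.

```python
from collections import Counter
from itertools import combinations
from zlib import crc32


def owner(seq, num_nodes):
    """Hypothetical hash function: map a sequence (tuple of str) to a node id."""
    return crc32("|".join(seq).encode("utf-8")) % num_nodes


def partition_candidates(candidates, num_nodes):
    """Assign each candidate sequence to exactly one node by hashing it."""
    parts = [set() for _ in range(num_nodes)]
    for cand in candidates:
        parts[owner(cand, num_nodes)].add(cand)
    return parts


def count_support(customer_seqs, candidates, k, num_nodes):
    """Simulate one counting pass: route each k-subsequence to its owner node."""
    parts = partition_candidates(candidates, num_nodes)
    counts = [Counter() for _ in range(num_nodes)]
    for cseq in customer_seqs:
        # Distinct order-preserving subsequences of length k; the set ensures
        # each customer contributes at most once to a candidate's support.
        for sub in set(combinations(cseq, k)):
            node = owner(sub, num_nodes)  # "sent" to a single node only
            if sub in parts[node]:        # compared only against local candidates
                counts[node][sub] += 1
    return parts, counts


customers = [("a", "b", "c"), ("a", "c"), ("b", "c", "a")]
candidates = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
parts, counts = count_support(customers, candidates, k=2, num_nodes=2)

# Merge the per-node counters to obtain the global support of each candidate.
total = Counter()
for node_counts in counts:
    total.update(node_counts)
print(dict(total))
```

In this sketch each node holds only its own share of the candidates, so the memory needed per node shrinks as nodes are added, and a subsequence is compared only against the candidates stored on the node it hashes to.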
Pages: 283-294
Number of pages: 12