H-mine: Hyper-structure mining of frequent patterns in large databases

Cited by: 174

Authors:

Pei, J ^{[1]}

Han, JW ^{[1]}

Lu, HJ ^{[1]}

Nishio, S ^{[1]}

Tang, SW ^{[1]}

Yang, DQ ^{[1]}

Affiliations:

[1] Peking Univ, Beijing 100871, Peoples R China

Source:

2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2001

Keywords:

DOI:

10.1109/ICDM.2001.989550

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Methods for efficient mining of frequent patterns have been studied extensively by many researchers. However, the previously proposed methods still encounter some performance bottlenecks when mining databases with different data characteristics, such as dense vs. sparse, long vs. short patterns, memory-based vs. disk-based, etc. In this study, we propose a simple and novel hyper-linked data structure, H-struct, and a new mining algorithm, H-mine, which takes advantage of this data structure and dynamically adjusts links in the mining process. A distinct feature of this method is that it has very limited and precisely predictable space overhead and runs really fast in a memory-based setting. Moreover, it can be scaled up to very large databases by database partitioning, and when the data set becomes dense, (conditional) FP-trees can be constructed dynamically as part of the mining process. Our study shows that H-mine has high performance in various kinds of data, outperforms the previously developed algorithms in different settings, and is highly scalable in mining large databases. This study also proposes a new data mining methodology, space-preserving mining, which may have strong impact in the future development of efficient and scalable data mining methods.
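
The idea in the abstract of mining through links into a single stored copy of the data, rather than through copied projected databases, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `h_mine`, the `rows` list standing in for H-struct, and the `(row index, position)` pairs standing in for hyper-links are all assumptions made for illustration. The header tables and in-place link adjustment are simplified to per-call lists, and database partitioning and the switch to FP-trees on dense data are omitted.

```python
from collections import defaultdict


def h_mine(transactions, min_support):
    """Illustrative sketch: return {frozenset(pattern): support}."""
    # Pass 1: count item supports and keep only the frequent items.
    counts = defaultdict(int)
    for t in transactions:
        for item in set(t):
            counts[item] += 1
    frequent = {i for i, c in counts.items() if c >= min_support}

    # Fixed global item order (sorted here purely for determinism).
    order = {item: rank for rank, item in enumerate(sorted(frequent))}

    # Stand-in for H-struct: every transaction is stored exactly once,
    # restricted to frequent items and sorted by the global order.
    rows = [
        tuple(sorted((i for i in set(t) if i in frequent), key=order.get))
        for t in transactions
    ]

    results = {}

    def mine(prefix, links):
        # Each link (r, pos) says: row r contains `prefix`, and the items
        # that may extend it are rows[r][pos:]. No row is ever copied.
        local = defaultdict(list)
        for r, pos in links:
            row = rows[r]
            for p in range(pos, len(row)):
                local[row[p]].append((r, p + 1))
        for item in sorted(local, key=order.get):
            item_links = local[item]
            if len(item_links) >= min_support:
                pattern = prefix + (item,)
                results[frozenset(pattern)] = len(item_links)
                mine(pattern, item_links)

    mine((), [(r, 0) for r in range(len(rows))])
    return results


# Hypothetical toy database, not taken from the paper.
toy_db = [["a", "c", "d"], ["b", "c", "e"], ["a", "b", "c", "e"], ["b", "e"]]
patterns = h_mine(toy_db, min_support=2)
for itemset, support in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

In this sketch the only extra memory beyond the stored rows is the lists of integer pairs, which is meant to echo the abstract's point about limited space overhead; the actual space bound claimed for H-mine depends on details of H-struct that the abstract does not spell out.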

Pages: 441 - 448

Page count: 8

50 items in total

[1] Hyper-structure mining of frequent patterns in uncertain data streams
Chandima HewaNadungodage
Yuni Xia
Jaehwan John Lee
Yi-cheng Tu
[J]. Knowledge and Information Systems, 2013, 37 : 219 - 244
[2] H-Mine: Fast and space-preserving frequent pattern mining in large databases
Pei, Jian
Han, Jiawei
Lu, Hongjun
Nishio, Shojiro
Tang, Shiwei
Yang, Dongqing
[J]. IIE TRANSACTIONS, 2007, 39 (06) : 593 - 605
[3] Hyper-structure mining of frequent patterns in uncertain data streams
HewaNadungodage, Chandima
Xia, Yuni
Lee, Jaehwan John
Tu, Yi-cheng
[J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2013, 37 (01) : 219 - 244
[4] Mining frequent δ-free patterns in large databases
Hébert, C
Crémilleux, B
[J]. DISCOVERY SCIENCE, PROCEEDINGS, 2005, 3735 : 124 - 136
[5] Mining Probabilistically Frequent Sequential Patterns in Large Uncertain Databases
Zhao, Zhou
Yan, Da
Ng, Wilfred
[J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (05) : 1171 - 1184
[6] An Efficient Algorithm for Mining Maximal Frequent Sequential Patterns in Large Databases
Su, Qiu-bin
Lu, Lu
Cheng, Bin
[J]. 2018 INTERNATIONAL CONFERENCE ON COMMUNICATION, NETWORK AND ARTIFICIAL INTELLIGENCE (CNAI 2018), 2018, : 404 - 410
[7] Incremental mining item sets based on H-mine and XML
Feng, Xingjie
Zhang, Jing
[J]. JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2009, 3 (02) : 271 - 287
[8] Incremental mining item sets based on H-mine and XML
Feng, Xingjie
Zhang, Jing
[J]. DCABES 2007 Proceedings, Vols I and II, 2007, : 786 - 790
[9] Mining Frequent Patterns from Hypergraph Databases
Alam, Md Tanvir
Ahmed, Chowdhury Farhan
Samiullah, Md
Leung, Carson K.
[J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT II, 2021, 12713 : 3 - 15
[10] Mining frequent spatial patterns in image databases
Chen, Wei-Ta
Chen, Yi-Ling
Chen, Ming-Syan
[J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 699 - 703
