A Bounded and Adaptive Memory-Based Approach to Mine Frequent Patterns From Very Large Databases

Cited by: 11
Authors
Adnan, Muhaimenul [1]
Alhajj, Reda [1, 2]
Affiliations
[1] Univ Calgary, Dept Comp Sci, Calgary, AB T2N 1N4, Canada
[2] Global Univ, Dept Comp Sci, Beirut 155085, Lebanon
Keywords
Association rules mining (ARM); FP-growth; FP-tree; frequent pattern mining; frequent patterns; index structures; secondary storage; virtual memory management (VMM); TREE;
DOI
10.1109/TSMCB.2010.2048900
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Most existing methods for association rules mining (ARM) rely on special data structures that project the database, either totally or partially, into primary memory. Traditionally, these data structures reside in main memory and rely on the paging mechanism of the virtual memory manager (VMM) to handle storage when they outgrow primary memory. Typically, the VMM moves the overflow data to secondary memory based on preassumed memory usage criteria. However, this direct and unplanned use of virtual memory results in unpredictable behavior or thrashing, as reported in several works in the literature. This paper tackles the problem by presenting an ARM model capable of mining a transactional database regardless of its size and without relying on the underlying VMM. The proposed approach uses only a bounded portion of primary memory, which leaves the rest of main memory available to other tasks of different priority. In other words, we propose a specialized memory management system that caters to the needs of the ARM model: the proposed data structure is first constructed in the allocated primary memory. If at any point the structure outgrows its memory quota, part of it is saved to secondary memory. The secondary-memory version of the structure is accessed on a block-by-block basis so that both the spatial and temporal localities of I/O access are optimized. Thus, the proposed framework takes control of virtual memory access and manages the required virtual memory in the way that best serves the mining process. Several data structures are used to facilitate these optimizations.
Our method has the additional advantage that other tasks of different priorities may run concurrently with the main mining task with as little interference as possible, because we do not rely on the default paging mechanism of the VMM. The reported test results demonstrate the applicability and effectiveness of the proposed approach.
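The abstract's central idea, a structure that lives within a fixed memory quota and spills to secondary storage block by block, can be illustrated with a small sketch. This is not the authors' data structure or algorithm: it is a generic least-recently-used block buffer, and every name in it (`BoundedBlockStore`, `max_blocks`, `count_items`, `items_per_block`) is hypothetical. It only counts single-item supports, which is the first pass of most frequent-pattern miners, and assumes items are small non-negative integers.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class BoundedBlockStore:
    """Keeps at most `max_blocks` blocks in memory; the rest live on disk.

    Illustrative only: a plain LRU buffer, not the structure from the paper.
    """

    def __init__(self, max_blocks, spill_dir=None):
        if max_blocks < 1:
            raise ValueError("max_blocks must be at least 1")
        self.max_blocks = max_blocks
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="blocks_")
        self._resident = OrderedDict()  # block_id -> block, oldest first
        self.loads = 0   # blocks read back from disk
        self.spills = 0  # blocks written to disk

    def _path(self, block_id):
        return os.path.join(self.spill_dir, f"block_{block_id}.pkl")

    def _evict_if_needed(self):
        # Write the least recently used blocks out until we fit the quota.
        while len(self._resident) > self.max_blocks:
            victim_id, victim = self._resident.popitem(last=False)
            with open(self._path(victim_id), "wb") as f:
                pickle.dump(victim, f)
            self.spills += 1

    def get(self, block_id):
        """Return the block, loading it from disk if it was spilled."""
        if block_id in self._resident:
            self._resident.move_to_end(block_id)  # mark as recently used
        else:
            path = self._path(block_id)
            if os.path.exists(path):
                with open(path, "rb") as f:
                    block = pickle.load(f)
                self.loads += 1
            else:
                block = {}  # first touch: start with an empty block
            self._resident[block_id] = block
            self._evict_if_needed()
        return self._resident[block_id]


def count_items(transactions, store, items_per_block=2):
    """Count item supports, grouping neighbouring item ids into one block."""
    for transaction in transactions:
        for item in transaction:
            block = store.get(item // items_per_block)
            block[item] = block.get(item, 0) + 1


store = BoundedBlockStore(max_blocks=2)
transactions = [[0, 1, 4], [1, 2, 5], [0, 4, 5], [1, 3]]
count_items(transactions, store)
support = {item: store.get(item // 2)[item] for item in range(6)}
print(support, "spills:", store.spills, "loads:", store.loads)
```

Callers should update a block immediately after `get`, since a reference held across later `get` calls may point at a block that has already been written out. Grouping neighbouring item ids into one block is a crude stand-in for the locality-aware layout the abstract describes.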
Pages: 154-172
Number of pages: 19