Memory-Efficient Sequential Pattern Mining with Hybrid Tries

Authors
Hosseininasab, Amin [1]
van Hoeve, Willem-Jan [2]
Cire, Andre A. [3]
Affiliations
[1] Univ Florida, Warrington Coll Business, Gainesville, FL 32611 USA
[2] Carnegie Mellon Univ, Tepper Sch Business, Pittsburgh, PA USA
[3] Univ Toronto, Rotman Sch Management, Toronto, ON, Canada
Keywords
Sequential pattern mining; Memory efficiency; Large-scale pattern mining; Trie data set models; GENERATION
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper develops a memory-efficient approach for Sequential Pattern Mining (SPM), a fundamental topic in knowledge discovery that faces a well-known memory bottleneck for large data sets. Our methodology involves a novel hybrid trie data structure that exploits recurring patterns to compactly store the data set in memory, and a corresponding mining algorithm designed to effectively extract patterns from this compact representation. Numerical results on small to medium-sized real-life test instances show an average improvement of 85% in memory consumption and 49% in computation time compared to the state of the art. For large data sets, our algorithm stands out as the only capable SPM approach within 256 GB of system memory, potentially saving 1.7 TB in memory consumption.
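The idea of storing a sequence database in a trie so that recurring prefixes are kept only once can be illustrated with a minimal sketch. This is a plain prefix trie, not the hybrid trie or the mining algorithm of the paper, whose details the abstract does not give. The names `TrieNode`, `build_trie`, `count_nodes`, and `support` are hypothetical, and each sequence is assumed to be a list of single items rather than itemsets.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    """One node per distinct prefix; `count` is how many sequences share it."""
    count: int = 0
    children: dict = field(default_factory=dict)


def build_trie(sequences):
    """Insert every sequence; shared prefixes reuse existing nodes."""
    root = TrieNode()
    for seq in sequences:
        node = root
        for item in seq:
            node = node.children.setdefault(item, TrieNode())
            node.count += 1
    return root


def count_nodes(node):
    """Number of stored nodes, excluding the root."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def support(node, pattern):
    """Number of sequences that contain `pattern` as a (gapped) subsequence.

    The first occurrence of each pattern item along a path is matched
    greedily; once the pattern is exhausted, every sequence passing
    through the current node contains it, so `count` is returned.
    """
    if not pattern:
        return node.count
    total = 0
    for item, child in node.children.items():
        rest = pattern[1:] if item == pattern[0] else pattern
        total += support(child, rest)
    return total


# Hypothetical toy data set, for illustration only.
sequences = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"], ["b", "c"]]
trie = build_trie(sequences)
total_items = sum(len(s) for s in sequences)
stored_nodes = count_nodes(trie)
```

On this toy input the trie stores fewer nodes than the raw data set has items, because the prefixes `a` and `a, b` are shared. The support of a pattern is computed directly on the compact structure without expanding it back into individual sequences.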
Pages: 29