Scalable out-of-core itemset mining

Cited by: 5
Authors
Baralis, Elena [1 ]
Cerquitelli, Tania [1 ]
Chiusano, Silvia [1 ]
Grand, Alberto [1 ]
Affiliations
[1] Politecnico di Torino, Dipartimento di Automatica e Informatica, I-10129 Turin, Italy
Keywords
Itemset mining; Data mining; Index support; Frequent; Algorithms
DOI
10.1016/j.ins.2014.08.073
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Itemset mining looks for correlations among data items in large transactional datasets. Traditional in-core mining algorithms do not scale well with huge data volumes, and are hindered by critical issues such as long execution times due to massive memory swap and main-memory exhaustion. This work is aimed at overcoming the scalability issues of existing in-core algorithms by improving their memory usage. A persistent structure, VLDBMine, to compactly store huge transactional datasets on disk and efficiently support large-scale itemset mining is proposed. VLDBMine provides a compact and complete representation of the data, by exploiting two different data structures suitable for diverse data distributions, and includes an appropriate indexing structure, allowing selective data retrieval. Experimental validation, performed on both real and synthetic datasets, shows the compactness of the VLDBMine data structure and the efficiency and scalability on large datasets of the mining algorithms supported by it. (C) 2014 Elsevier Inc. All rights reserved.
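The out-of-core idea in the abstract can be illustrated with a minimal sketch. It is not VLDBMine and does not reproduce its data structures or index; it only shows the general principle of counting itemset support while streaming transactions from disk instead of loading the whole dataset into main memory. The file layout (one whitespace-separated transaction per line) and the function names `stream_transactions`, `frequent_itemsets_up_to_pairs`, and `demo` are assumptions made for illustration.

```python
import itertools
import os
import tempfile
from collections import Counter


def stream_transactions(path):
    """Yield one transaction per line as a sorted tuple of distinct items.

    Assumed (hypothetical) layout: items separated by whitespace.
    Only one line is held in memory at a time.
    """
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            items = line.split()
            if items:
                yield tuple(sorted(set(items)))


def frequent_itemsets_up_to_pairs(path, min_support):
    """Two sequential passes over the file, Apriori-style, limited to pairs."""
    # Pass 1: count the support of every single item.
    item_counts = Counter()
    for transaction in stream_transactions(path):
        item_counts.update(transaction)
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count only pairs built from frequent items, since a pair
    # cannot be frequent unless both of its items are.
    pair_counts = Counter()
    for transaction in stream_transactions(path):
        kept = [i for i in transaction if i in frequent_items]
        pair_counts.update(itertools.combinations(kept, 2))

    result = {(i,): item_counts[i] for i in frequent_items}
    result.update({p: c for p, c in pair_counts.items() if c >= min_support})
    return result


def demo():
    """Write a tiny made-up dataset to a temporary file and mine it."""
    rows = ["a b c", "a b", "a c", "b d", "a b c"]
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("\n".join(rows))
        path = tmp.name
    try:
        return frequent_itemsets_up_to_pairs(path, min_support=3)
    finally:
        os.remove(path)
```

Calling `demo()` mines the five toy transactions with a minimum support of 3. The counters still live in memory, so this sketch only bounds the memory spent on the raw transactions; a persistent, indexed structure such as the one the abstract proposes targets the cases where that is not enough.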
Pages: 146-162
Page count: 17
Related papers
50 in total
  • [1] Scalable Out-of-core OpenSHMEM Library for HPC
    Gomez-Iglesias, Antonio
    Vienne, Jerome
    Hamidouche, Khaled
    Simmons, Christopher S.
    Barth, William L.
    Panda, Dhabaleswar
    [J]. OPENSHMEM AND RELATED TECHNOLOGIES: EXPERIENCES, IMPLEMENTATIONS, AND TECHNOLOGIES, OPENSHMEM 2015, 2015, 9397 : 138 - 153
  • [2] Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models
    Qin, Chengjie
    Torres, Martin
    Rusu, Florin
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2017, 10 (10): : 986 - 997
  • [3] An efficient and scalable parallel algorithm for out-of-core isosurface extraction and rendering
    Wang, Qin
    Jaja, Joseph
    Varshney, Amitabh
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2007, 67 (05) : 592 - 603
  • [4] Kaleido: An Efficient Out-of-core Graph Mining System on A Single Machine
    Zhao, Cheng
    Zhang, Zhibin
    Xu, Peng
    Zheng, Tianqi
    Guo, Jiafeng
    [J]. 2020 IEEE 36TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2020), 2020, : 673 - 684
  • [5] Issues in the design of scalable out-of-core dense symmetric indefinite factorization algorithms
    Strazdins, PE
    [J]. COMPUTATIONAL SCIENCE - ICCS 2003, PT III, PROCEEDINGS, 2003, 2659 : 715 - 724
  • [6] GAMER with out-of-core computation
    Schive, Hsi-Yu
    Tsai, Yu-Chih
    Chiueh, Tzihong
    [J]. COMPUTATIONAL STAR FORMATION, 2011, (270): : 401 - 405
  • [7] Out-of-core MLS reconstruction
    Fiorin, Valentino
    Cignoni, Paolo
    Scopigno, Roberto
    [J]. PROCEEDINGS OF THE NINTH IASTED INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS AND IMAGING, 2007, : 27 - 34
  • [8] A Cholesky out-of-core factorization
    Castellanos, J. A.
    Larrazabal, G.
    [J]. MATHEMATICAL AND COMPUTER MODELLING, 2013, 57 (9-10) : 2207 - 2222
  • [9] A Review of Scalable Approaches for Frequent Itemset Mining
    Apiletti, Daniele
    Garza, Paolo
    Pulvirenti, Fabio
    [J]. NEW TRENDS IN DATABASES AND INFORMATION SYSTEMS (ADBIS 2015), 2015, 539 : 243 - 247
  • [10] Amy files for out-of-core computations
    Zhang, Y
    Apon, A
    Pulay, P
    [J]. PDPTA'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS 1-4, 2003, : 191 - 197