Scalable Vertical Mining for Big Data Analytics of Frequent Itemsets

Cited by: 15
Authors
Leung, Carson K. [1]
Zhang, Hao [1]
Souza, Joglas [1]
Lee, Wookey [2]
Affiliations
[1] Univ Manitoba, Winnipeg, MB, Canada
[2] Inha Univ, Incheon, South Korea
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Data mining; Knowledge discovery; Frequent patterns; Vertical mining; Big data; Spark; Visual analytics; Patterns
DOI
10.1007/978-3-319-98809-2_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Advances in technology and the growing popularity of the Internet of Things (IoT) in many applications have produced huge volumes of data at high velocity. These valuable big data can be of wide variety and differing veracity. Embedded in these big data are useful information and valuable knowledge. This has led to data science, which aims to apply big data analytics to mine implicit, previously unknown, and potentially useful information from big data. As a popular data analytics task, frequent itemset mining discovers knowledge about sets of frequently co-occurring items in big data. The task has drawn attention in both academia and industry, partly because of its practicality in various real-life applications. Existing mining approaches mostly use serial, distributed, or parallel algorithms to mine the data horizontally (i.e., on a transaction basis). In this paper, we present an alternative big data analytics approach. Specifically, our scalable algorithm uses the MapReduce programming model, running in a Spark environment, to mine the data vertically (i.e., on an item basis). Evaluation results show the effectiveness of our algorithm in big data analytics of frequent itemsets.
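
To make the horizontal versus vertical distinction concrete, here is a minimal single-machine sketch in plain Python. It is not the paper's Spark/MapReduce algorithm: it assumes an Eclat-style representation in which each item maps to the set of transaction IDs containing it, and the names `to_vertical` and `mine_vertical` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Turn a horizontal database (one item set per transaction)
    into a vertical one (one transaction-ID set per item)."""
    vertical = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def mine_vertical(transactions, minsup):
    """Return {frozenset(itemset): support} for every itemset whose
    support (number of containing transactions) is at least minsup.
    Assumption: Eclat-style depth-first search over tid-set intersections."""
    vertical = to_vertical(transactions)
    frequent = {}

    def extend(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            # Support of prefix + item is the size of the tid-set intersection.
            new_tids = tids if prefix_tids is None else prefix_tids & tids
            if len(new_tids) >= minsup:
                itemset = prefix + (item,)
                frequent[frozenset(itemset)] = len(new_tids)
                # Only extend with items later in the order to avoid duplicates.
                extend(itemset, new_tids, candidates[i + 1:])

    # Keep only frequent single items, in a fixed order.
    singles = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= minsup),
        key=lambda pair: pair[0],
    )
    extend((), None, singles)
    return frequent


# Hypothetical sample data: four transactions over items a, b, c, d.
sample = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
result = mine_vertical(sample, minsup=2)
```

In this sketch the support of an itemset is obtained by intersecting item-level ID sets rather than rescanning transactions, which is the general idea behind mining on an item basis; how the paper distributes that work across Spark is not shown here.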
Pages: 3-17
Page count: 15