Scalable Vertical Mining for Big Data Analytics of Frequent Itemsets

Cited by: 15

Authors:

Leung, Carson K. [1]

Zhang, Hao [1]

Souza, Joglas [1]

Lee, Wookey [2]

Affiliations:

[1] Univ Manitoba, Winnipeg, MB, Canada

[2] Inha Univ, Incheon, South Korea

Source:

DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2018, PT I | 2018, Vol. 11029

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC)

Keywords:

Data mining; Knowledge discovery; Frequent patterns; Vertical mining; Big data; Spark; VISUAL ANALYTICS; PATTERNS;

DOI:

10.1007/978-3-319-98809-2_1

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Advances in technology and the growing popularity of the Internet of Things (IoT) in many applications have produced huge volumes of data at a high velocity. These valuable big data can be of a wide variety and of varying veracity. Embedded in these big data are useful information and valuable knowledge. This has led to data science, which aims to apply big data analytics to mine implicit, previously unknown, and potentially useful information from big data. As a popular data analytic task, frequent itemset mining discovers knowledge about sets of frequently co-occurring items in big data. The task has drawn attention in both academia and industry, partly because of its practicality in various real-life applications. Existing mining approaches mostly use serial, distributed, or parallel algorithms to mine the data horizontally (i.e., on a transaction basis). In this paper, we present an alternative big data analytic approach. Specifically, our scalable algorithm uses the MapReduce programming model, running in a Spark environment, to mine the data vertically (i.e., on an item basis). Evaluation results show the effectiveness of our algorithm in big data analytics of frequent itemsets.
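To illustrate the distinction the abstract draws between horizontal (transaction-based) and vertical (item-based) mining, here is a minimal sketch in plain Python. It is a generic Eclat-style tidset intersection, not the authors' Spark/MapReduce algorithm, which this record does not describe in detail. The function names `to_vertical` and `mine_vertical`, the `minsup` parameter, and the sample transactions are all assumptions made for illustration.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Turn horizontal data (one item set per transaction) into
    vertical data (one set of transaction ids per item)."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def mine_vertical(transactions, minsup):
    """Return {itemset tuple: support count} for every itemset whose
    support is at least minsup, using tidset intersections."""
    vertical = to_vertical(transactions)
    frequent = {}

    def extend(prefix, candidates):
        # Each candidate is (item, tidset of prefix + item), already frequent.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Support of a longer itemset is the size of the intersection.
            longer = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= minsup:
                    longer.append((other, common))
            if longer:
                extend(itemset, longer)

    start = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= minsup),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Hypothetical sample data: four transactions over items a, b, c.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
result = mine_vertical(sample, minsup=2)
```

In this sketch the support of an itemset is obtained by intersecting the transaction-id sets of its items, so the original transactions are scanned only once, when the vertical layout is built. A distributed version would additionally need to partition the item-based data across workers, which is outside the scope of this illustration.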


Pages: 3-17

Page count: 15

50 records in total

[1] Frequent Itemsets Mining for Big Data: A Comparative Analysis
Apiletti, Daniele
Baralis, Elena
Cerquitelli, Tania
Garza, Paolo
Pulvirenti, Fabio
Venturini, Luca
[J]. BIG DATA RESEARCH, 2017, 9 : 67 - 83
[2] Mining Frequent Itemsets with Vertical Data Layout in MapReduce
Jen, Tao-Yuan
Marinica, Claudia
Ghariani, Abir
[J]. INFORMATION SEARCH, INTEGRATION AND PERSONALIZATION, ISIP 2014, 2016, 497 : 66 - 82
[3] The Algorithm for Mining Global Frequent Itemsets based on Big Data
Bo, He
[J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON LOGISTICS, ENGINEERING, MANAGEMENT AND COMPUTER SCIENCE (LEMCS 2015), 2015, 117 : 158 - 161
[4] Scalable algorithm for mining maximal frequent itemsets
Li, QH
Wang, H
He, Y
Jiang, SY
[J]. 2003 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-5, PROCEEDINGS, 2003, : 143 - 146
[5] A New Approximate Method For Mining Frequent Itemsets From Big Data *
Valiullin, Timur
Huang, Zhexue
Wei, Chenghao
Yin, Jianfei
Wu, Dingming
Egorova, Iuliia
[J]. COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2021, 18 (03) : 641 - 656
[6] Research on Mining Global Maximal Frequent Itemsets for Health Big Data
He, Bo
Pei, Jianhui
[J]. 2017 IEEE 3RD INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC), 2017, : 1143 - 1146
[7] A Scalable Data Analytics Algorithm for Mining Frequent Patterns from Uncertain Data
MacKinnon, Richard Kyle
Leung, Carson Kai-Sang
Tanbeer, Syed K.
[J]. TRENDS AND APPLICATIONS IN KNOWLEDGE DISCOVERY AND DATA MINING, 2014, 8643 : 404 - 416
[8] NECLATCLOSED: A vertical algorithm for mining frequent closed itemsets
Aryabarzan, Nader
Minaei-Bidgoli, Behrouz
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 174
[9] Frequent Itemsets Mining Using Vertical Index List
Sahaphong, Supatra
[J]. 2009 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, VOL 2, 2009, : 480 - 484
[10] Mining frequent itemsets from uncertain data
Chui, Chun-Kit
Kao, Ben
Hung, Edward
[J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2007, 4426 : 47 - +
