Atrak: a MapReduce-based data warehouse for big data

Cited by: 0

Authors:

Mohammadhossein Barkhordari

Mahdi Niamanesh

Affiliation:

[1] Advance Information System Research Group for Information and Communication Technology Research Centre,

Source:

The Journal of Supercomputing | 2017 / Vol. 73

Keywords:

Big data; MapReduce; Data warehouse; Data locality

DOI:

Not available

Abstract:

As warehouse data volumes grow, single-node solutions can no longer analyze them, so shared-nothing architectures such as MapReduce become necessary. However, segmenting data across nodes in MapReduce causes node connectivity issues, network congestion, poor use of node memory capacity, and inefficient use of processing power. In addition, dimensions and measures cannot be changed without changing previously stored data, and managing big dimensions is a further problem. This paper proposes a method called Atrak, which uses a unified data format to make Mapper nodes independent and thereby address these data management problems. The method applies to star schema data warehouse models with distributive measures. Atrak increases query execution speed through node independence and proper use of MapReduce. It was compared with established systems such as Hive, Spark-SQL, HadoopDB and Flink, and simulation results confirm its improved query execution speed. Data unification in MapReduce can also be applied in other fields, such as data mining and graph processing.
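
The abstract's restriction to distributive measures can be illustrated with a small sketch. This is not the Atrak implementation; it only assumes that each node holds rows that already carry their dimension values (a stand-in for the unified data format), so a mapper can aggregate locally without contacting other nodes. The partition contents, the key layout `(region, year)`, and the function names `map_partition` and `reduce_partials` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-node partitions. Each row is (region, year, amount) and
# already contains its dimension values, so no cross-node lookup is needed.
PARTITIONS = [
    [("A", 2016, 10.0), ("A", 2017, 5.0), ("B", 2016, 2.0)],
    [("A", 2016, 1.5), ("B", 2016, 4.0)],
]


def map_partition(rows):
    """Aggregate one node's rows locally into {(region, year): (sum, count)}."""
    partial = defaultdict(lambda: (0.0, 0))
    for region, year, amount in rows:
        running_sum, running_count = partial[(region, year)]
        partial[(region, year)] = (running_sum + amount, running_count + 1)
    return dict(partial)


def reduce_partials(partials):
    """Merge per-node partial aggregates.

    This is valid because SUM and COUNT are distributive: the aggregate of
    the whole equals the aggregate of the per-partition aggregates.
    """
    total = defaultdict(lambda: (0.0, 0))
    for partial in partials:
        for key, (part_sum, part_count) in partial.items():
            total_sum, total_count = total[key]
            total[key] = (total_sum + part_sum, total_count + part_count)
    return dict(total)


# Each partition is processed independently; only the small partial
# aggregates are combined at the end.
result = reduce_partials(map_partition(p) for p in PARTITIONS)
```

A measure such as the median would not fit this pattern, because it cannot be recovered from per-partition results alone, which is one plausible reading of why the method is scoped to distributive measures.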

Pages: 4596 - 4610

Number of pages: 14

[1] Atrak: a MapReduce-based data warehouse for big data
Barkhordari, Mohammadhossein
Niamanesh, Mahdi
[J]. JOURNAL OF SUPERCOMPUTING, 2017, 73 (10) : 4596 - 4610