Atrak: a MapReduce-based data warehouse for big data

被引:0
|
作者
Mohammadhossein Barkhordari
Mahdi Niamanesh
机构
[1] Advance Information System Research Group for Information and Communication Technology Research Centre,
来源
关键词
Big data; MapReduce; Data warehouse; Data locality;
D O I
暂无
中图分类号
学科分类号
摘要
As warehouse data volumes expand, single-node solutions can no longer analyze the immense volume of data. Therefore, it is necessary to use shared nothing architectures such as MapReduce. Inter-node data segmentation in MapReduce creates node connectivity issues, network congestion, improper use of node memory capacity and inefficient processing power. In addition, it is not possible to change dimensions and measures without changing previously stored data and big dimension management. In this paper, a method called Atrak is proposed, which uses a unified data format to make Mapper nodes independent to solve the data management problem mentioned earlier. The proposed method can be applied to star schema data warehouse models with distributive measures. Atrak increases query execution speed by employing node independence and the proper use of MapReduce. The proposed method was compared to established methods such as Hive, Spark-SQL, HadoopDB and Flink. Simulation results confirm improved query execution speed of the proposed method. Using data unification in MapReduce can be used in other fields, such as data mining and graph processing.
引用
收藏
页码:4596 / 4610
页数:14
相关论文
共 50 条
  • [1] Atrak: a MapReduce-based data warehouse for big data
    Barkhordari, Mohammadhossein
    Niamanesh, Mahdi
    [J]. JOURNAL OF SUPERCOMPUTING, 2017, 73 (10): : 4596 - 4610
  • [2] A MapReduce-Based ELM for Regression in Big Data
    Wu, B.
    Yan, T. H.
    Xu, X. S.
    He, B.
    Li, W. H.
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2016, 2016, 9937 : 164 - 173
  • [3] A MapReduce-based Fuzzy Associative Classifier for Big Data
    Ducange, Pietro
    Marcelloni, Francesco
    Segatori, Armando
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015), 2015,
  • [4] Verifying Properties of MapReduce-Based Big Data Processing
    Zhang, Nan
    Wang, Meng
    Duan, Zhenhua
    Tian, Cong
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (01) : 321 - 338
  • [5] MapReduce-based storage and indexing for big health data
    Gayathiri, N. R.
    Natarajan, A. M.
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (14):
  • [6] An Accelerated MapReduce-Based K-prototypes for Big Data
    Ben HajKacem, Mohamed Aymen
    Ben N'cir, Chiheb-Eddine
    Essoussi, Nadia
    [J]. SOFTWARE TECHNOLOGIES: APPLICATIONS AND FOUNDATIONS (STAF 2016), 2016, 9946 : 13 - 25
  • [7] A MapReduce-based approach to social network big data mining
    Qi, Fuli
    [J]. JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2023, 23 (05) : 2535 - 2547
  • [8] A MapReduce-based scalable discovery and indexing of structured big data
    Singh, Hari
    Bawa, Seema
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2017, 73 : 32 - 43
  • [9] A MapReduce-based Approach to Scale Big Semantic Data Compression with HDT
    Gimenez, J. M.
    Fernandez, J. D.
    Martinez, M. A.
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2017, 15 (07) : 1270 - 1277
  • [10] MapReduce-based K-Prototypes Clustering Method for Big Data
    Ben HajKacem, Mohamed Aymen
    Ben N'cir, Chiheb-Eddine
    Essoussi, Nadia
    [J]. PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (IEEE DSAA 2015), 2015, : 1030 - 1036