Enhancing in-memory efficiency for MapReduce-based data processing

Cited by: 5
Authors
Veiga, Jorge [1]
Exposito, Roberto R. [1]
Taboada, Guillermo L. [1]
Tourino, Juan [1]
Affiliations
[1] Univ A Coruna, Comp Architecture Grp, Campus A Coruna, La Coruna 15071, Spain
Keywords
Big data; MapReduce; In-memory computing; Garbage collector (GC); Performance evaluation; Performance
DOI
10.1016/j.jpdc.2018.04.001
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
As the memory capacity of computational systems increases, the in-memory data management of Big Data processing frameworks becomes more crucial for performance. This paper analyzes and improves the memory efficiency of Flame-MR, a framework that accelerates Hadoop applications, providing valuable insight into the impact of memory management on performance. By optimizing memory allocation, garbage collection overheads and execution times have been reduced by up to 85% and 44%, respectively, on a multi-core cluster. Moreover, different data buffer implementations are evaluated, showing that off-heap buffers achieve better results overall. Memory resources are also leveraged by caching intermediate results, improving the performance of iterative applications by up to 26%. The memory-enhanced version of Flame-MR has been compared with Hadoop and Spark on the Amazon EC2 cloud platform. The experimental results show significant performance benefits, reducing Hadoop execution times by up to 65% while providing very competitive results compared to Spark. © 2018 Elsevier Inc. All rights reserved.
Pages: 323-338
Page count: 16
Related papers
50 in total
  • [1] MapReduce-based Data Processing on IoT
    Satoh, Ichiro
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE (ITHINGS) - 2014 IEEE INTERNATIONAL CONFERENCE ON GREEN COMPUTING AND COMMUNICATIONS (GREENCOM) - 2014 IEEE INTERNATIONAL CONFERENCE ON CYBER-PHYSICAL-SOCIAL COMPUTING (CPS), 2014, : 161 - 168
  • [2] Verifying Properties of MapReduce-Based Big Data Processing
    Zhang, Nan
    Wang, Meng
    Duan, Zhenhua
    Tian, Cong
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (01) : 321 - 338
  • [3] Shape Recognition Based on MapReduce and In-Memory Processing on Distributed File System
    Baik, Namkyun
    Hazra, Dipankar
    Bhattacharyya, Debnath
    [J]. INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2018, 11 (02): 21 - 30
  • [4] Atrak: a MapReduce-based data warehouse for big data
    Barkhordari, Mohammadhossein
    Niamanesh, Mahdi
    [J]. JOURNAL OF SUPERCOMPUTING, 2017, 73 (10): 4596 - 4610
  • [5] A MapReduce-Based ELM for Regression in Big Data
    Wu, B.
    Yan, T. H.
    Xu, X. S.
    He, B.
    Li, W. H.
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2016, 2016, 9937 : 164 - 173
  • [6] Atrak: a MapReduce-based data warehouse for big data
    Mohammadhossein Barkhordari
    Mahdi Niamanesh
    [J]. The Journal of Supercomputing, 2017, 73 : 4596 - 4610
  • [7] MapReduce-based Image Processing System with Automated Parallelization
    Sozykin, A. V.
    Goldshtein, M. L.
    [J]. BULLETIN OF THE SOUTH URAL STATE UNIVERSITY SERIES-MATHEMATICAL MODELLING PROGRAMMING & COMPUTER SOFTWARE, 2012, (13): 109 - 118
  • [8] Scaling up MapReduce-based Big Data Processing on Multi-GPU systems
    Jiang, Hai
    Chen, Yi
    Qiao, Zhi
    Weng, Tien-Hsiung
    Li, Kuan-Ching
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (01): 369 - 383
  • [9] MapReduce-Based Improved Random Forest Model for Massive Educational Data Processing and Classification
    Xu, Wei
    Hoang, Vinh Truong
    [J]. MOBILE NETWORKS & APPLICATIONS, 2021, 26 (01): 191 - 199
  • [10] MapReduce-Based Improved Random Forest Model for Massive Educational Data Processing and Classification
    Wei Xu
    Vinh Truong Hoang
    [J]. Mobile Networks and Applications, 2021, 26 : 191 - 199