Accelerating Big Data Applications Using Lightweight Virtualization Framework on Enterprise Cloud

Cited by: 0
Authors
Bhimani, Janki [1]
Yang, Zhengyu [1]
Leeser, Miriam [1]
Mi, Ningfang [1]
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, 360 Huntington Ave, Boston, MA 02115 USA
Funding
US National Science Foundation;
Keywords
Virtual Machine (VM); Container; Docker; Apache Spark; Big Data; Cloud Computing; Resource Management; Task Assignment; Workload Evaluation & Estimation; MAPREDUCE;
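DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract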
Hypervisor-based virtualization technology has been successfully used to deploy high-performance and scalable infrastructure for Hadoop and, now, Spark applications. Container-based virtualization techniques are becoming an important option and are increasingly used because of their lightweight operation and better scaling compared to virtual machines (VMs). With containerization techniques such as Docker becoming mature and promising better performance, we can use Docker to speed up big data applications. However, as applications have different behaviors and resource requirements, before replacing traditional hypervisor-based virtual machines with Docker, it is important to analyze and compare the performance of applications running in the cloud with VMs and Docker containers. The VM approach provides distributed resource management for different virtual machines, each running with its own allocated resources, while Docker relies on a pool of resources shared among all containers. Here, we investigate the performance of different Apache Spark applications using both VMs and Docker containers. While others have looked at Docker's performance, this is the first study that compares these different virtualization frameworks for a big data enterprise cloud environment using Apache Spark. In addition to makespan and execution time, we also analyze the utilization of different resources (CPU, disk, memory, etc.) by Spark applications. Our results show that Spark using Docker can obtain a speed-up of over 10 times compared to using VMs. However, we observe that this may not apply to all applications due to different workload patterns and different resource management schemes performed by virtual machines and containers. Our work can guide application developers, system administrators and researchers to better design and deploy big data applications on their platforms to improve the overall performance.
Pages: 7
Related papers
50 in total
  • [31] Developing a government enterprise architecture framework to support the requirements of big and open linked data with the use of cloud computing
    Lnenicka, Martin
    Komarkova, Jitka
    INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT, 2019, 46 : 124 - 141
  • [32] An Optimization Framework for Migrating and Deploying Multiclass Enterprise Applications Into the Cloud
    Li, Shiyong
    Liu, Huan
    Li, Wenzhe
    Sun, Wei
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2023, 16 (02) : 941 - 956
  • [33] Heterogeneous Cloud Framework for Big Data Genome Sequencing
    Wang, Chao
    Li, Xi
    Chen, Peng
    Wang, Aili
    Zhou, Xuehai
    Yu, Hong
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, 12 (01) : 166 - 178
  • [34] Regulatory and Policy Framework for Cloud, Big Data in Korea
    Lee, Kure Chel
    2015 INTERNATIONAL CONFERENCE ON ICT CONVERGENCE (ICTC), 2015, : 1200 - 1204
  • [35] A Cloud Framework for Big Data Analytics Workflows on Azure
    Marozzo, Fabrizio
    Talia, Domenico
    Trunfio, Paolo
    CLOUD COMPUTING AND BIG DATA, 2013, 23 : 182 - 191
  • [36] Envisioning a New Future for the Enterprise with a Big Data Experience Framework
    McCreary, Faith
    McEwan, Anne
    Schloss, Derrick
    Gomez, Marla
    NEW PERSPECTIVES IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 1, 2014, 275 : 199 - 209
  • [37] A Mobile Cloud Computing Model Using the Cloudlet Scheme for Big Data Applications
    Tawalbeh, Lo'ai A.
    Bakheder, Waseem
    Song, Houbing
    2016 IEEE FIRST INTERNATIONAL CONFERENCE ON CONNECTED HEALTH: APPLICATIONS, SYSTEMS AND ENGINEERING TECHNOLOGIES (CHASE), 2016, : 73 - 77
  • [38] Digital Transformation of Enterprise Finance under Big Data and Cloud Computing
    Zhang, Feiteng
    Wireless Communications and Mobile Computing, 2022, 2022
  • [39] Enterprise Digital Management Efficiency under Cloud Computing and Big Data
    Tang, Wei
    Yang, Shuili
    SUSTAINABILITY, 2023, 15 (17)
  • [40] Applying a Lightweight Enterprise Architecture Framework for Parcel Data Entry Optimization
    Falcao, Marcos
    Guerra, Grennda
    Neves, Moises
    Sena, Iona
    Santos, Simone
    2018 IEEE 22ND INTERNATIONAL ENTERPRISE DISTRIBUTED OBJECT COMPUTING CONFERENCE WORKSHOPS (EDOCW 2018), 2018, : 107 - 114