Selection of Virtual Machines Based on Classification of MapReduce Jobs

Authors
Blaisse, Adam Pasqua [1]
Wagner, Zachary Andrew [1]
Wu, Jie [1]
Affiliations
[1] Temple Univ, Dept Comp & Informat Sci, Philadelphia, PA 19122 USA
Keywords
MapReduce; Hadoop; Eucalyptus; Cloud Computing; Virtual Machine
DOI
10.1109/ICDCSW.2015.25
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
The MapReduce computing paradigm has become a popular and useful tool since its introduction. Many large companies, including Facebook, IBM, Yahoo, Twitter, and Google, have found ways to incorporate MapReduce into their operations. A driving force behind its growing popularity is the need for a system that can handle and process large data sets. MapReduce is a distributed system that can handle larger quantities of data when more servers are added to a cluster. As data sets keep growing, existing MapReduce clusters have had to grow as well, and this growth can cause problems. Newly added servers are often not the same type as those already in the cluster. This is a problem because MapReduce and its open-source implementation, Hadoop, both assume that all servers in the cluster are identical. Because of this, many researchers have focused on improving scheduling within MapReduce for heterogeneous clusters. More recently, cloud computing has become popular. The idea is to run virtual machines within a cluster of servers. Since these machines are virtual, we can spin up as many identical machines as a project calls for. While this seems like a good fix for the heterogeneous MapReduce cluster problem, it leads to other issues, which we address. This paper addresses a major issue: selecting virtual machines that maximize the speed of a MapReduce job.
Pages: 82-86
Page count: 5