Selection of Virtual Machines Based on Classification of MapReduce Jobs

Cited by: 1
Authors
Blaisse, Adam Pasqua [1 ]
Wagner, Zachary Andrew [1 ]
Wu, Jie [1 ]
Affiliation
[1] Temple Univ, Dept Comp & Informat Sci, Philadelphia, PA 19122 USA
Keywords
MapReduce; Hadoop; Eucalyptus; Cloud Computing; Virtual Machine;
DOI
10.1109/ICDCSW.2015.25
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The MapReduce computing paradigm has become a very popular and useful tool since its introduction. Many large companies, including Facebook, IBM, Yahoo, Twitter, and Google, have found intuitive ways to incorporate MapReduce into their current needs and operations. A driving force behind the growing popularity of MapReduce is the need for a system that can handle and process large data. MapReduce is a distributed system that can handle large quantities of data by adding more servers to a cluster. With large data sets only getting larger, there has been a need to increase the size of currently running MapReduce clusters. This growth can lead to some problems. Often, newly added servers are not the same type of server already used by the cluster. This is a problem because MapReduce and its open-source implementation, Hadoop, both assume that all servers in the cluster are the same. Because of these issues, many researchers have focused on improving scheduling within MapReduce for heterogeneous clusters. More recently, the idea of cloud computing has become popular. The idea is to run virtual machines within a cluster of servers. Since these machines are virtual, we can spin up as many identical machines as the project calls for. While this seems like a good fix for the heterogeneous MapReduce cluster problem, it leads to other issues that we will address. This paper addresses a major issue in selecting virtual machines that maximize the speed of a MapReduce job.
Pages: 82-86
Page count: 5