Modeling job arrivals in a data-intensive Grid

Authors
Li, Hui [1]
Muskulus, Michael [2]
Wolters, Lex [1]
Affiliations
[1] Leiden Univ, LIACS, Niels Bohrweg 1, NL-2333 CA Leiden, Netherlands
[2] Leiden Univ, Inst Math, NL-2333 CA Leiden, Netherlands
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this paper we present an initial analysis of job arrivals in a production data-intensive Grid and investigate several traffic models to characterize the interarrival time processes. Our analysis focuses on the heavy-tail behavior and autocorrelation structures, and the modeling is carried out at three different levels: Grid, Virtual Organization (VO), and region. A set of m-state Markov modulated Poisson processes (MMPP) is investigated, while Poisson processes and hyperexponential renewal processes are evaluated for comparison studies. We apply the transportation distance metric from dynamical systems theory to further characterize the differences between the data trace and the simulated time series, and estimate errors by bootstrapping. The experimental results show that MMPPs with a certain number of states are successful to a certain extent in simulating the job traffic at different levels, fitting both the interarrival time distribution and the autocorrelation function. However, MMPPs are not able to match the autocorrelations for certain VOs, in which strong deterministic semi-periodic patterns are observed. These patterns are further characterized using different representations. Future work is needed to model both deterministic and stochastic components in order to better capture the correlation structure in the series.
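As a rough illustration of the kind of model the abstract refers to, the following sketch simulates interarrival times from a 2-state Markov modulated Poisson process and estimates their lag-1 autocorrelation. It is not the authors' implementation: the function names (`simulate_mmpp2`, `lag1_autocorr`), the two-state restriction, and all rate values are assumptions chosen for illustration only.

```python
import random


def simulate_mmpp2(rate, switch, n_jobs, seed=0):
    """Simulate n_jobs interarrival times from a 2-state MMPP (illustrative).

    rate[i]   : Poisson arrival rate while the hidden chain is in state i
    switch[i] : rate at which the hidden chain leaves state i
    Both are hypothetical parameters, not values fitted to any trace.
    """
    rng = random.Random(seed)
    state = 0
    gaps = []
    since_last = 0.0
    while len(gaps) < n_jobs:
        # Competing exponentials: the next event is either a job arrival
        # or a switch of the hidden state, whichever happens first.
        total = rate[state] + switch[state]
        since_last += rng.expovariate(total)
        if rng.random() < rate[state] / total:
            gaps.append(since_last)   # arrival: record the gap and reset
            since_last = 0.0
        else:
            state = 1 - state         # hidden state switches, no arrival
    return gaps


def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


if __name__ == "__main__":
    # A "busy" state and a "quiet" state with slow switching between them.
    gaps = simulate_mmpp2(rate=[10.0, 0.1], switch=[0.05, 0.05], n_jobs=5000)
    print("mean interarrival time:", sum(gaps) / len(gaps))
    print("lag-1 autocorrelation:", lag1_autocorr(gaps))
```

Setting both arrival rates equal reduces the sketch to a plain Poisson process, which is the baseline the abstract compares against.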
Pages: 210 / +
Page count: 4