Modeling and Optimizing Large-Scale Wide-Area Data Transfers

Cited by: 11
Authors
Kettimuthu, Rajkumar [1, 2]
Vardoyan, Gayane [1]
Agrawal, Gagan [2]
Sadayappan, P. [2]
Affiliations
[1] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA
[2] Ohio State Univ, Comp Sci & Engn, Columbus, OH 43210 USA
Keywords
wide-area data transfer; GridFTP; modeling data transfer; bandwidth allocation
DOI
10.1109/CCGrid.2014.114
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data generated by experimental, simulation, and observational science is growing exponentially. The resulting datasets are often transported over wide-area networks for storage, analysis, or visualization. Network bandwidth, which is not increasing at the same rate as dataset sizes, is becoming a key obstacle to data-driven sciences. In this paper, we focus on how bandwidth allocation can be controlled at the level of a protocol such as GridFTP, in view of goals such as maintaining certain priorities or performing scheduling with specified objectives. In particular, we explore how GridFTP transfer performance can be controlled by using parallelism and concurrency. We find that concurrency is a more powerful control knob than parallelism. For a source where most bandwidth is consumed by transfers to a small number of destinations, we build a model for each destination's achieved throughput in terms of its own concurrency and the total concurrency (over GridFTP transfers) to the other major destinations. We then enhance this model by including an indicator of the time-varying external load, using multiple ways to measure this external load. We study the effectiveness of the proposed models in controlling the bandwidth allocation. After evaluating the numerous combinations of models and methods of measuring external load, we narrow the field to the four best-performing ones, based on both their validation results and their applicability. After extensive testing of these four approaches, we find that they can obtain desired bandwidth allocations with a mean (median) error rate of 19.8% (13.8%), with 38% of the errors in our benchmark tests being less than 10% and 54% of them being less than 15%.
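The abstract does not give the functional form of the throughput model, so the following is only a minimal sketch under a stated assumption: a destination's throughput is taken to be the source capacity multiplied by that destination's share of all competing streams, with external load expressed as an equivalent number of streams. The proportional-share form, the function names, and the parameters `capacity`, `ext_load`, and `max_cc` are all hypothetical and are not taken from the paper. The sketch shows how such a model could be inverted to pick a concurrency for a target throughput, and how the percentage errors summarized by a mean and median could be computed.

```python
from statistics import mean, median


def predicted_throughput(capacity, own_cc, other_cc, ext_load=0.0):
    """Hypothetical proportional-share model (an assumption, not the paper's).

    capacity : total throughput available at the source
    own_cc   : concurrency of transfers to this destination
    other_cc : total concurrency to the other major destinations
    ext_load : external load, expressed as an equivalent stream count
    """
    return capacity * own_cc / (own_cc + other_cc + ext_load)


def pick_concurrency(capacity, target, other_cc, ext_load=0.0, max_cc=32):
    """Return the concurrency in 1..max_cc whose prediction is closest to target."""
    return min(
        range(1, max_cc + 1),
        key=lambda cc: abs(
            predicted_throughput(capacity, cc, other_cc, ext_load) - target
        ),
    )


def percent_errors(targets, achieved):
    """Absolute error of each achieved throughput, as a percentage of its target."""
    return [100.0 * abs(a - t) / t for t, a in zip(targets, achieved)]


def summarize(errors):
    """Mean error, median error, and the fraction of errors below 10% and 15%."""
    n = len(errors)
    return {
        "mean": mean(errors),
        "median": median(errors),
        "below_10": sum(e < 10.0 for e in errors) / n,
        "below_15": sum(e < 15.0 for e in errors) / n,
    }
```

For instance, with a hypothetical capacity of 10 units and 4 competing streams to other destinations, `pick_concurrency(10.0, 5.0, 4)` selects the concurrency whose predicted throughput is nearest to 5 units. A real controller would replace `predicted_throughput` with a model fitted to measured transfers.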
Pages: 196-205
Page count: 10
Related papers
50 in total
  • [41] Optimizing data stream processing for large-scale applications
    Cappellari, Paolo
    Roantree, Mark
    Chun, Soon Ae
    [J]. SOFTWARE-PRACTICE & EXPERIENCE, 2018, 48 (09) : 1607 - 1641
  • [42] Differentiated Scheduling of Response-Critical and Best-Effort Wide-Area Data Transfers
    Kettimuthu, Rajkumar
    Agrawal, Gagan
    Sadayappan, P.
    Foster, Ian
    [J]. 2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2016), 2016, : 1113 - 1122
  • [43] A Study on Designing Autonomous Decentralized Method of User-Aware Resource Assignment in Large-Scale and Wide-Area Networks
    Kashimoto, Toshitaka
    Toyoda, Fumiya
    Sakumoto, Yusuke
    [J]. ADVANCES IN INTELLIGENT NETWORKING AND COLLABORATIVE SYSTEMS (INCOS-2021), 2022, 312 : 169 - 182
  • [44] Modeling and Simulation of Dynamic Communication Latency and Data Aggregation for Wide-Area Applications
    Cui, Yinan
    Kavasseri, Rajesh G.
    Chaudhuri, Nilanjan Ray
    [J]. 2016 WORKSHOP ON MODELING AND SIMULATION OF CYBER-PHYSICAL ENERGY SYSTEMS (MSCPES), 2016,
  • [45] Measurements and Analytics of Wide-Area File Transfers Over Dedicated Connections
    Rao, Nageswara S. V.
    Liu, Qiang
    Sen, Satyabrata
    Liu, Zhengchun
    Kettimuthu, Raj
    Foster, Ian
    [J]. ICDCN '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING AND NETWORKING, 2019, : 183 - 192
  • [46] Sustained Wide-Area TCP Memory Transfers Over Dedicated Connections
    Rao, Nageswara S. V.
    Towsley, Don
    Vardoyan, Gayane
    Settlemyer, Bradley W.
    Foster, Ian T.
    Kettimuthu, Raj
    [J]. 2015 IEEE 17TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2015 IEEE 7TH INTERNATIONAL SYMPOSIUM ON CYBERSPACE SAFETY AND SECURITY, AND 2015 IEEE 12TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (ICESS), 2015, : 1603 - 1606
  • [47] Topic modeling for large-scale text data
    Xi-ming Li
    Ji-hong Ouyang
    You Lu
    [J]. Frontiers of Information Technology & Electronic Engineering, 2015, 16 : 457 - 465
  • [48] Topic modeling for large-scale text data
    Li, Xi-ming
    Ouyang, Ji-hong
    Lu, You
    [J]. FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2015, 16 (06) : 457 - 465
  • [49] Siphon: A High-Performance Substrate for Inter-Datacenter Transfers in Wide-Area Data Analytics
    Liu, Shuhao
    Chen, Li
    Li, Baochun
    [J]. PROCEEDINGS OF THE 2017 SYMPOSIUM ON CLOUD COMPUTING (SOCC '17), 2017, : 646 - 646
  • [50] DATA HIGHWAYS FOR WIDE-AREA PROCESS COMPUTING
    HOLDEN, DG
    [J]. CHEMICAL ENGINEERING, 1984, 91 (10) : 73 - &