Maximizing data locality in distributed systems

Cited by: 6
Authors
Chung, Fan
Graham, Ronald
Bhagwan, Ranjita [1]
Savage, Stefan
Affiliations
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY USA
[2] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92103 USA
Keywords
bin packing; distributed systems; combinatorial algorithms; approximation algorithms
DOI
10.1016/j.jcss.2006.07.001
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The effectiveness of a distributed system hinges on the manner in which tasks and data are assigned to the underlying system resources. Moreover, today's large-scale distributed systems must accommodate heterogeneity both in the offered load and in the makeup of the available storage and compute capacity. The ideal resource assignment must balance the utilization of the underlying system against the loss of locality incurred when individual tasks or data objects are fragmented among several servers. In this paper we describe this locality-maximizing placement problem and show that finding an optimal solution is NP-hard. We then describe a polynomial-time algorithm that generates a placement within an additive constant of two of optimal. (C) 2006 Elsevier Inc. All rights reserved.
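To make the placement problem in the abstract concrete, the following is a minimal sketch of a simple greedy heuristic. It is not the paper's algorithm and carries no approximation guarantee. It assumes integer object sizes and server capacities, and it measures lost locality as the number of extra fragments created when an object is split. The function name `place` and its parameters are hypothetical.

```python
def place(sizes, capacities):
    """Greedy placement sketch (illustrative only, not the paper's method).

    sizes:      integer size of each task or data object
    capacities: integer capacity of each server
    Returns (assignment, splits), where assignment maps an object index to a
    list of (server, amount) pieces and splits counts the extra fragments.
    """
    if sum(sizes) > sum(capacities):
        raise ValueError("total load exceeds total capacity")

    remaining = list(capacities)
    assignment = {}
    splits = 0

    # Handle large objects first, since they are the hardest to keep whole.
    for item in sorted(range(len(sizes)), key=lambda i: -sizes[i]):
        need = sizes[item]

        # Best fit: the tightest server that can hold the whole object.
        fits = [s for s in range(len(remaining)) if remaining[s] >= need]
        if fits:
            s = min(fits, key=lambda srv: remaining[srv])
            remaining[s] -= need
            assignment[item] = [(s, need)]
            continue

        # Otherwise fragment the object, using the emptiest servers first
        # so that it is cut into as few pieces as this heuristic can manage.
        pieces = []
        for s in sorted(range(len(remaining)), key=lambda srv: -remaining[srv]):
            if need == 0:
                break
            take = min(need, remaining[s])
            if take > 0:
                pieces.append((s, take))
                remaining[s] -= take
                need -= take
        assignment[item] = pieces
        splits += len(pieces) - 1

    return assignment, splits
```

For example, `place([4, 3, 3], [5, 5])` uses both servers fully and has to split one of the size-3 objects across them, which is the utilization-versus-locality tension the abstract describes.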
Pages: 1309-1316
Number of pages: 8