Aurora: Adaptive Block Replication in Distributed File Systems

Cited by: 10

Authors:

Zhang, Qi [1]

Zhang, Sai Qian [1]

Leon-Garcia, Alberto [1]

Boutaba, Raouf [2]

Affiliations:

[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 1A1, Canada

[2] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON N2L 3G1, Canada

Source:

2015 IEEE 35TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS | 2015

Keywords:

DOI:

10.1109/ICDCS.2015.52

Chinese Library Classification (CLC) number:

TP3 [Computing Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Distributed file systems such as the Google File System and the Hadoop Distributed File System are used to store large volumes of data in cloud data centers. These systems divide data sets into fixed-size blocks and replicate them over multiple machines to achieve both reliability and efficiency. Recent studies have shown that data blocks differ widely in popularity. In this context, the naive block replication schemes used by these systems often cause an uneven load distribution across machines, which reduces the overall I/O throughput of the system. While many replication algorithms have been proposed, existing solutions have not carefully studied how to place data blocks so that load is balanced across machines while node- and rack-level reliability requirements are satisfied. In this paper, we study the dynamic data replication problem with the goal of balancing machine load while ensuring that machine- and rack-level reliability requirements are met. We propose several local search algorithms that provide constant approximation guarantees yet are simple and practical to implement. We further present Aurora, a dynamic block placement mechanism that implements these algorithms in the Hadoop Distributed File System with minimal overhead. Through experiments using workload traces from Yahoo! and Facebook, we show that Aurora reduces machine load imbalance by up to 26.9% compared to existing solutions, while satisfying node- and rack-level reliability requirements.
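
To make the idea of load-balancing local search under rack constraints concrete, the following is a minimal illustrative sketch. It is not the algorithm from the paper, whose details are not given in this record. It assumes a simple load model in which each block's popularity is split evenly among its replicas, and all names (`machine_loads`, `rack_ok`, `local_search`, `min_racks`) are hypothetical.

```python
from collections import defaultdict


def machine_loads(placement, popularity):
    """Assumed load model: a block's popularity is shared equally by its replicas."""
    loads = defaultdict(float)
    for block, machines in placement.items():
        share = popularity[block] / len(machines)
        for m in machines:
            loads[m] += share
    return loads


def rack_ok(machines, rack_of, min_racks):
    """Rack-level reliability check: replicas must span at least min_racks racks."""
    return len({rack_of[m] for m in machines}) >= min_racks


def local_search(placement, popularity, rack_of, min_racks=2, max_moves=1000):
    """Move one replica at a time from the busiest to the idlest machine.

    A move is accepted only if the destination stays below the source's
    current load and the block still satisfies the rack constraint.
    """
    placement = {b: set(ms) for b, ms in placement.items()}  # work on a copy
    all_machines = list(rack_of)
    for _ in range(max_moves):
        loads = machine_loads(placement, popularity)
        src = max(all_machines, key=lambda m: loads[m])
        dst = min(all_machines, key=lambda m: loads[m])
        best = None  # (block, share) of the largest improving move found
        for block, machines in placement.items():
            if src not in machines or dst in machines:
                continue
            share = popularity[block] / len(machines)
            candidate = (machines - {src}) | {dst}
            improves = loads[dst] + share < loads[src]
            if improves and rack_ok(candidate, rack_of, min_racks):
                if best is None or share > best[1]:
                    best = (block, share)
        if best is None:
            break  # local optimum: no single move helps
        placement[best[0]].remove(src)
        placement[best[0]].add(dst)
    return placement
```

In this sketch the number of replicas per block never changes, so node-level redundancy is preserved by construction, and only the rack spread has to be rechecked on each move.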


Pages: 442-451

Page count: 10

