Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications

Cited by: 10
Authors
Lee, Sangmin [1]
Guo, Zhenhua [1]
Sunercan, Omer [1]
Ying, Jun [1]
Kooburat, Thawan [1]
Biswal, Suryadeep [1]
Chen, Jun [1]
Huang, Kun [1]
Cheung, Yatpang [1]
Zhou, Yiding [1]
Veeraraghavan, Kaushik [1]
Damani, Biren [1]
Ruiz, Pol Mauri [1]
Mehta, Vikas [1]
Tang, Chunqiang [1]
Affiliations
[1] Facebook Inc, Menlo Pk, CA 94025 USA
Keywords
shard management; sharding; availability
DOI
10.1145/3477132.3483546
Chinese Library Classification
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Sharding is widely used to scale an application. Despite a decade of effort to build generic sharding frameworks that can be reused across different applications, the extent of their success remains unclear. We attempt to answer a fundamental question: what barriers prevent a sharding framework from being adopted by the majority of sharded applications? We analyze hundreds of sharded applications at Facebook and identify two major barriers: 1) lack of support for geo-distributed applications, which account for most of Facebook's applications, and 2) inability to maintain application availability during planned events such as software upgrades, which happen roughly 1000 times more frequently than unplanned failures. A sharding framework that does not help applications address these fundamental challenges is not sufficiently attractive for most applications to adopt. Other adoption barriers include the burden of supporting many complex applications in a one-size-fits-all sharding framework and the difficulty of supporting sophisticated shard-placement requirements. Theoretically, a constraint solver can handle complex placement requirements, but in practice it is not scalable enough to perform near-realtime shard placement at a global scale. We have overcome these adoption barriers in Facebook's sharding framework, called Shard Manager. Shard Manager is currently used by hundreds of applications, which account for about 54% of all sharded applications at Facebook and run on over one million machines.
Pages: 553-569
Page count: 17