Dynamic erasure coding decision for modern block-oriented distributed storage systems

Cited by: 3
Authors
Ahn, Hoo-Young [1]
Lee, Kyong-Ha [2]
Lee, Yoon-Joon [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Comp, 291 Daehak Ro, Taejon 305701, South Korea
[2] KISTI, Sci Data Res Ctr, 245 Daehak Ro, Daejeon 305806, South Korea
Source
JOURNAL OF SUPERCOMPUTING | 2016 / Vol. 72 / Issue 04
Keywords
Distributed storage system; Storage overhead; Hadoop; HDFS; Data replication; Erasure coding; RAID;
DOI
10.1007/s11227-016-1661-7
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Modern block-oriented distributed storage systems such as the Hadoop Distributed File System (HDFS) have proliferated in this era of big data and cloud computing. These systems feature block-level replication: files are partitioned into equal-sized blocks, and multiple copies of each block are then distributed arbitrarily across nodes for fault tolerance and data availability. However, this strategy wastes a large amount of storage capacity on block copies whose data may not be accessed frequently. Distributed storage systems have therefore begun to adopt erasure codes. Classical parity encoding schemes, however, are hard to apply directly to these systems because block copies are placed arbitrarily across nodes. We present a novel technique, called DynaEC, to address these issues in modern block-oriented distributed storage systems. DynaEC provides a unique parity encoding algorithm that encodes data blocks arbitrarily distributed across machines into parities and then places the parities so that fault tolerance is guaranteed. Parity encoding in DynaEC is performed without any change to the original block placement policy of HDFS, which lets DynaEC work seamlessly with HDFS. Finally, during the encoding procedure each data node encodes only its own data blocks and requires no information about blocks located on other data nodes. As a result, the encoding procedure in DynaEC runs fully in parallel without any synchronization issues. With extensive experiments, we show that DynaEC saves storage volume up to the theoretical limit while outperforming previous approaches by multiple orders of magnitude.
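The idea of a node encoding only its own blocks can be illustrated with a minimal sketch. This is not DynaEC's algorithm, which the abstract does not spell out; it assumes plain single-parity XOR over groups of `k` local blocks, and the names `xor_parity`, `encode_local`, `recover`, and the group size `k` are hypothetical choices made for illustration.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR equal-sized blocks byte by byte into one parity block."""
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def encode_local(blocks: list[bytes], k: int) -> list[bytes]:
    """Hypothetical per-node step: group this node's blocks k at a time
    and emit one parity per group, using no data from other nodes."""
    return [xor_parity(blocks[i:i + k]) for i in range(0, len(blocks), k)]


def recover(parity: bytes, survivors: list[bytes]) -> bytes:
    """Rebuild the single missing block of a group from its parity
    and the surviving blocks of that group."""
    return xor_parity([parity, *survivors])


# Blocks held by one data node (tiny stand-ins for real storage blocks).
node_blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parities = encode_local(node_blocks, k=2)

# Lose the first block, then rebuild it from the parity and its group mate.
restored = recover(parities[0], [node_blocks[1]])
```

In this sketch a parity only helps if it is stored on a different node from the blocks it protects, which is presumably why the abstract stresses parity placement. Single-parity XOR also tolerates only one lost block per group; stronger codes would be needed for more.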
Pages: 1312-1341
Number of pages: 30