Fault-tolerant distributed mass storage for LHC computing

Authors
Wiebalck, A [1]
Breuer, PT [1]
Lindenstruth, V [1]
Steinbeck, TM [1]
Affiliations
[1] Univ Heidelberg, Kirchhoff Inst Phys, Chair Comp Sci & Comp Engn, Heidelberg, Germany
DOI
10.1109/CCGRID.2003.1199377
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we present the concept and first prototyping results of a modular, fault-tolerant distributed mass storage architecture for large Linux PC clusters of the kind deployed by the upcoming particle physics experiments. The key concept of the ClusterRAID system is device masquerading: an Enhanced Network Block Device (ENBD) makes remote disks appear as local block devices, which enables local RAID over remote disks. Because the ENBD provides a block-level interface to remote files, partitions, or disks, the standard Linux software RAID can be used to add fault tolerance to the system. Preliminary performance measurements indicate that the latency is comparable to that of a local hard drive. With four disks, the first prototypes achieved throughput of up to 55 MB/s in a RAID0 setup and about 40 MB/s in a RAID5 setup.
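The fault tolerance that RAID5 adds on top of the block devices rests on XOR parity, which can be illustrated with a minimal sketch. This is not the ClusterRAID or Linux md implementation: the function names `xor_blocks`, `make_stripe`, and `rebuild` are hypothetical, blocks are plain in-memory byte strings rather than remote disks, and the parity block is always kept last, whereas real RAID5 rotates it across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Build one stripe: the data blocks followed by their parity block.

    Assumption for illustration: parity is stored last instead of rotating.
    """
    data_blocks = list(data_blocks)
    return data_blocks + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    # Four "disks": three data blocks plus one parity block per stripe.
    stripe = make_stripe([b"abcd", b"efgh", b"ijkl"])
    # Simulate losing disk 1 and rebuilding its block from the other three.
    print(rebuild(stripe, 1) == stripe[1])
```

Because the XOR of all blocks in a stripe is zero, any single missing block, data or parity, equals the XOR of the survivors; this is why one failed disk (or, in the setting of the abstract, one failed remote node) can be tolerated.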
Pages: 266-273
Page count: 8