HBA: Distributed metadata management for large cluster-based storage systems

Cited by: 49
Authors
Zhu, Yifeng [1]
Jiang, Hong [2]
Wang, Jun [3]
Xian, Feng [2]
Affiliations
[1] Univ Maine, Dept Elect & Comp Engn, Orono, ME 04473 USA
[2] Univ Nebraska, Dept Comp Sci & Engn, Lincoln, NE 68588 USA
[3] Univ Cent Florida, Sch Elect Engn & Comp Sci, Orlando, FL 32816 USA
Funding
National Aeronautics and Space Administration (NASA); National Science Foundation (NSF);
Keywords
distributed file systems; file system management; metadata management; Bloom filter;
DOI
10.1109/TPDS.2007.70788
Chinese Library Classification (CLC) number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
An efficient and distributed scheme for file mapping or file lookup is critical in decentralizing metadata management within a group of metadata servers. This paper presents a novel technique called Hierarchical Bloom Filter Arrays (HBA) to map filenames to the metadata servers holding their metadata. Two levels of probabilistic arrays, namely, the Bloom filter arrays with different levels of accuracies, are used on each metadata server. One array, with lower accuracy and representing the distribution of the entire metadata, trades accuracy for significantly reduced memory overhead, whereas the other array, with higher accuracy, caches partial distribution information and exploits the temporal locality of file access patterns. Both arrays are replicated to all metadata servers to support fast local lookups. We evaluate HBA through extensive trace-driven simulations and implementation in Linux. Simulation results show our HBA design to be highly effective and efficient in improving the performance and scalability of file systems in clusters with 1,000 to 10,000 nodes (or superclusters) and with the amount of data in the petabyte scale or higher. Our implementation indicates that HBA can reduce the metadata operation time of a single-metadata-server architecture by a factor of up to 43.9 when the system is configured with 16 metadata servers.
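The two-level lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the double-hashing scheme, the filter sizes, and the rule "trust a level only when exactly one server's filter matches" are all assumptions made for illustration. The abstract does not say how ambiguous or missing results are resolved, or how the higher-accuracy array is populated, so the sketch simply returns `None` in the first case and fills the higher-accuracy array through an explicit `record_access` call.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over strings: num_bits bits, num_hashes probes."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing (an assumption of this sketch): derive all probe
        # positions from two 64-bit halves of a single SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class TwoLevelLookup:
    """Hypothetical per-server view: one filter per metadata server, per level."""

    def __init__(self, num_servers, hot_bits, full_bits, num_hashes=4):
        # Higher-accuracy array: holds only recently accessed files.
        self.hot = [BloomFilter(hot_bits, num_hashes) for _ in range(num_servers)]
        # Lower-accuracy array: summarizes every file on every server.
        self.full = [BloomFilter(full_bits, num_hashes) for _ in range(num_servers)]

    def record_file(self, filename, server):
        self.full[server].add(filename)

    def record_access(self, filename, server):
        self.hot[server].add(filename)

    def lookup(self, filename):
        # Check the higher-accuracy array first, then the full one.
        for level in (self.hot, self.full):
            hits = [i for i, bf in enumerate(level) if filename in bf]
            if len(hits) == 1:
                return hits[0]
        # Miss or ambiguous match: the caller needs some fallback,
        # which this sketch leaves unspecified.
        return None


table = TwoLevelLookup(num_servers=4, hot_bits=4096, full_bits=1024)
table.record_file("/home/a/report.txt", server=2)
print(table.lookup("/home/a/report.txt"))
```

Because every server would hold the same arrays, a lookup like this runs locally without contacting other servers; only the unresolved `None` case would need network traffic.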
Pages: 750-763
Number of pages: 14
Related papers
50 in total
  • [21] Metadata Management for Distributed Multimedia Storage System
    Zhan, Ling
    Wan, Jiguang
    Gu, Peng
    [J]. PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON ELECTRONIC COMMERCE AND SECURITY, 2008, : 443 - +
  • [22] Cluster-based fast distributed consensus
    Li, Wenjun
    Dai, Huaiyu
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL III, PTS 1-3, PROCEEDINGS, 2007, : 185 - +
  • [23] DROP: Facilitating Distributed Metadata Management in EB-scale Storage Systems
    Xu, Quanqing
    Arumugam, Rajesh Vellore
    Yong, Khai Leong
    Mahadevan, Sridhar
    [J]. 2013 IEEE 29TH SYMPOSIUM ON MASS STORAGE SYSTEMS AND TECHNOLOGIES (MSST), 2013,
  • [24] Prototyping cluster-based distributed applications
    Dvorák, V
    Cejka, R
    [J]. DISTRIBUTED AND PARALLEL SYSTEMS : FROM INSTRUCTION PARALLELISM TO CLUSTER COMPUTING, 2000, 567 : 225 - 228
  • [25] A Cluster-Based Approach for Disturbed, Spatialized, Distributed Information Gathering Systems
    Quang-Anh Nguyen Vu
    Gaudou, Benoit
    Canal, Richard
    Hassas, Salima
    Armetta, Frederic
    [J]. PRINCIPLES AND PRACTICE OF MULTI-AGENT SYSTEMS, 2012, 7057 : 588 - +
  • [26] A cluster-based data storage management protocol in wireless sensor networks
    Liu, Lin
    Yu, Hai-bin
    Liang, Ying
    [J]. 2006 10TH INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY, VOLS 1 AND 2, PROCEEDINGS, 2006, : 860 - +
  • [27] Bristrita: Namespace and Metadata Distribution in Large-Scale Distributed Cloud Storage Systems
    Dewan, Hrishikesh
    Hansdah, R. C.
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD), 2018, : 116 - 124
  • [28] Measurement and analysis of TCP throughput collapse in cluster-based storage systems
    Phanishayee, Arnar
    Krevat, Elie
    Vasudevan, Vijay
    Andersen, David G.
    Ganger, Gregory R.
    Gibson, Garth A.
    Seshan, Srinivasan
    [J]. PROCEEDINGS OF THE 6TH USENIX CONFERENCE ON FILE AND STORAGE TECHNOLOGIES (FAST '08), 2008, : 175 - 188
  • [29] Extensible block-level storage virtualization in cluster-based systems
    Flouris, Michail D.
    Lachaize, Renaud
    Chasapis, Konstantinos
    Bilas, Angelos
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2010, 70 (08) : 800 - 824
  • [30] QoS resource management for cluster-based image retrieval systems
    Brüning, A
    Geisler, S
    Kao, O
    Hoefer, M
    [J]. PDPTA '05: PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS 1-3, 2005, : 301 - 307