An Efficient Replicated System for the Metadata of HDFS

Authors
Wang, Zhanye [1]
Xu, Tao [1]
Wang, Dongsheng [2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Res Inst Informat Technol, Beijing 100084, Peoples R China
Keywords
HDFS; namenode; metadata; availability; replication; NCluster
DOI
10.14257/ijgdc.2016.9.5.16
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
Hadoop HDFS is an open source project from the Apache Software Foundation for scalable, distributed computing and data storage. HDFS has become a critical component in today's cloud computing environment, and a wide range of applications are built on top of it. However, the initial design of HDFS introduced a single point of failure: HDFS contains only one active namenode, so if this namenode experiences a software or hardware failure, the whole HDFS cluster becomes unusable. This is one reason why people are reluctant to deploy HDFS for applications that require high availability. In this paper, we present a solution that enables high availability for the HDFS namenode through efficient metadata replication. Our solution has three major advantages over existing ones: (1) we utilize multiple active namenodes, instead of one, to build a cluster that serves metadata requests simultaneously; (2) we implement a pub/sub system to handle the metadata replication process across these active namenodes efficiently; (3) we propose a novel replication algorithm to deal with network delay when the namenodes are deployed in different areas. Based on this solution, we build a prototype called NCluster and integrate it with HDFS. We evaluate NCluster to demonstrate its feasibility and effectiveness. The experimental results show that our solution performs well, with low replication cost, good throughput, and scalability.
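As a rough illustration of the pub/sub idea mentioned in the abstract, the sketch below shows several active nodes that each publish metadata operations to a shared topic and apply every operation they receive. It is not the NCluster implementation: the `Broker` and `NameNode` classes, the `"metadata"` topic name, and the message fields are all hypothetical, the broker is in-process and synchronous, and the paper's delay-aware replication algorithm is not modeled.

```python
from collections import defaultdict


class Broker:
    """Hypothetical in-process pub/sub broker (a stand-in for a real messaging layer)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver synchronously, in subscription order, to every subscriber.
        for callback in list(self._subscribers[topic]):
            callback(message)


class NameNode:
    """Hypothetical active namenode holding a replica of the namespace metadata."""

    TOPIC = "metadata"  # assumed topic name, not taken from the paper

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.namespace = {}  # path -> metadata dict
        broker.subscribe(self.TOPIC, self._apply)

    def create(self, path, meta):
        # Publish the operation; every node (including this one) applies it on delivery.
        self.broker.publish(
            self.TOPIC,
            {"origin": self.name, "op": "create", "path": path, "meta": meta},
        )

    def delete(self, path):
        self.broker.publish(
            self.TOPIC, {"origin": self.name, "op": "delete", "path": path}
        )

    def _apply(self, message):
        # Apply a replicated metadata operation to the local namespace.
        if message["op"] == "create":
            self.namespace[message["path"]] = dict(message["meta"])
        elif message["op"] == "delete":
            self.namespace.pop(message["path"], None)


broker = Broker()
nn1 = NameNode("nn1", broker)
nn2 = NameNode("nn2", broker)

# A write accepted by nn1 becomes visible on nn2 through the shared topic.
nn1.create("/data/a.txt", {"replication": 3})
print(nn2.namespace)
```

Because delivery here is synchronous and totally ordered by a single broker, all replicas trivially converge; a real deployment across distant sites would have to handle ordering and network delay, which is the problem the abstract says the proposed replication algorithm addresses.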
Pages: 175-190
Page count: 16