Moving metadata from ad hoc files to database tables for robust, highly available, and scalable HDFS

Cited by: 0
Authors
Heesun Won
Minh Chau Nguyen
Myeong-Seon Gil
Yang-Sae Moon
Kyu-Young Whang
Affiliations
[1] KAIST, School of Computing
[2] ETRI, BigData Intelligence Research Department
[3] Kangwon National University, Department of Computer Science
Keywords
Hadoop; HDFS; Advanced HDFS; Distributed file systems; Metadata management
DOI
Not available
Abstract
As a representative large-scale data management technology, Apache Hadoop is an open-source framework for processing a variety of data such as SNS, medical, weather, and IoT data. Hadoop largely consists of HDFS, MapReduce, and YARN. Among them, we focus on improving the HDFS metadata management scheme responsible for storing and managing big data. We note that the current HDFS incurs many problems in system utilization due to its file-based metadata management. To solve these problems, we propose a novel metadata management scheme based on RDBMS for improving the functional aspects of HDFS. Through analysis of the latest HDFS, we first present five problems caused by its metadata management and derive three requirements of robustness, availability, and scalability for resolving these problems. We then design an overall architecture of the advanced HDFS, A-HDFS, which satisfies these requirements. In particular, we define functional modules according to HDFS operations and also present the detailed design strategy for adding or modifying the individual components in the corresponding modules. Finally, through implementation of the proposed A-HDFS, we validate its correctness by experimental evaluation and also show that A-HDFS satisfies all the requirements. The proposed A-HDFS significantly enhances the HDFS metadata management scheme and, as a result, ensures that the entire system improves its stability, availability, and scalability. Thus, we can exploit the improved distributed file system based on A-HDFS for various fields and, in addition, we can expect more applications to be actively developed.
Pages: 2657-2681
Number of pages: 24
Related papers
1 in total
  • [1] Moving metadata from ad hoc files to database tables for robust, highly available, and scalable HDFS
    Won, Heesun
    Nguyen, Minh Chau
    Gil, Myeong-Seon
    Moon, Yang-Sae
    Whang, Kyu-Young
    [J]. Journal of Supercomputing, 2017, 73(6): 2657-2681