Data Adaptively Storing Approach for Hadoop Distributed File System

Authors
Fu, Yingxun [1]
Wen, Shilin [1]
Ma, Li [1]
Affiliations
[1] North China University of Technology, College of Computer Science, Beijing, People's Republic of China
Keywords
HDFS; Data Access Frequency; Storage Policy; Hot Value
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Hadoop Distributed File System (HDFS) is an open-source framework that is widely used in cloud storage systems. Because of advances in device technology and cost considerations, today's storage systems usually contain both fast devices (such as SSDs) and slow devices (such as hard disks). To optimize performance, frequently accessed data should be stored on fast devices and infrequently accessed data on slow devices. To address this problem, recent versions of HDFS provide different storage policies for classifying data. However, traditional methods usually first assign the same storage policy to all data and then adjust the policy according to access frequency. This usually cannot provide a high hit rate for fast devices, because frequently accessed data cannot be labeled at the time they are uploaded. In this paper, we propose a data adaptively storing approach (DASA) to improve the hit rate of fast devices. DASA first predicts each file's hot value when it is uploaded and then sets the storage policy according to that hot value, so that frequently accessed data are assigned to fast devices. The evaluation results show that DASA provides a very good hit rate compared with the traditional method, which indicates that DASA achieves good performance on HDFS.
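The upload-time decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the hot value is predicted, so the sketch takes an already-predicted value as input. The function names, the normalization of hot values to [0, 1], and the threshold of 0.5 are all assumptions made for illustration. The policy names `ALL_SSD` and `HOT` and the `hdfs storagepolicies -setStoragePolicy` command are standard HDFS features; note that `HOT` is the HDFS default policy and keeps replicas on disk.

```python
from typing import Iterable, List

# Assumption: hot values are normalized to [0, 1]; this threshold is illustrative.
HOT_THRESHOLD = 0.5


def choose_policy(hot_value: float, threshold: float = HOT_THRESHOLD) -> str:
    """Map a predicted hot value to an HDFS storage policy name (hypothetical rule)."""
    # ALL_SSD keeps every replica on SSD; HOT (the HDFS default) keeps them on DISK.
    return "ALL_SSD" if hot_value >= threshold else "HOT"


def set_policy_command(path: str, policy: str) -> List[str]:
    """Build, but do not run, the hdfs CLI call that applies a policy to a path."""
    return ["hdfs", "storagepolicies", "-setStoragePolicy",
            "-path", path, "-policy", policy]


def fast_device_hit_rate(accessed_paths: Iterable[str],
                         fast_paths: Iterable[str]) -> float:
    """Fraction of accesses that land on files stored on fast devices."""
    accesses = list(accessed_paths)
    if not accesses:
        return 0.0
    fast = set(fast_paths)
    return sum(1 for p in accesses if p in fast) / len(accesses)
```

A caller would compute a hot value for each uploaded file, pass it to `choose_policy`, and hand the resulting command to something like `subprocess.run` on a machine with the Hadoop client installed. `fast_device_hit_rate` shows one plausible reading of the hit-rate metric the abstract refers to; the paper's exact definition may differ.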
Pages: 20-24
Page count: 5