A Novel Approach for Efficient Handling of Small Files in HDFS

Cited by: 0
Authors
Patel, Ankita [1]
Mehta, Mayuri A. [1]
Affiliations
[1] Sarvajanik Coll Engn & Technol, Dept Comp Engn, Surat, India
Keywords
Hadoop; HDFS; small files; file correlation; prefetching
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
The Hadoop Distributed File System (HDFS) is a representative cloud storage platform that offers scalable, reliable, and low-cost storage. It is designed to handle large files and therefore incurs a performance penalty when handling a huge number of small files. Furthermore, it does not consider the correlation between files, so it provides no prefetching mechanism that could improve access efficiency. In this paper, we propose a novel approach to handling small files in HDFS. The proposed approach combines correlated files into a single file to reduce metadata storage on the NameNode. We integrate prefetching and caching mechanisms into the proposed approach to improve the access efficiency of small files. Moreover, we analyze the performance of the proposed approach for file sizes in the range 32 KB to 4096 KB. The results show that the proposed approach reduces metadata storage compared to HDFS.
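The idea of merging correlated files and prefetching them can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use any HDFS API: it works on in-memory bytes, assumes that all files placed in one merged file are correlated, and uses a simple LRU cache. The names `merge_files`, `MergedFileReader`, `cache_size`, and `loads` are hypothetical and chosen only for this example.

```python
from collections import OrderedDict


def merge_files(files):
    """Concatenate small files (name -> bytes) into one blob.

    Returns the blob and an index mapping name -> (offset, length),
    so a single large object replaces many small ones.
    """
    blob = bytearray()
    index = {}
    for name, data in files.items():
        index[name] = (len(blob), len(data))
        blob.extend(data)
    return bytes(blob), index


class MergedFileReader:
    """Reads small files out of a merged blob with caching and prefetching.

    Assumption for this sketch: every file in the same merged blob is
    treated as correlated with the others.
    """

    def __init__(self, blob, index, cache_size=4):
        self.blob = blob
        self.index = index
        self.cache_size = cache_size
        self.cache = OrderedDict()  # name -> bytes, kept in LRU order
        self.loads = 0              # number of times data was cut from the blob

    def _load(self, name):
        offset, length = self.index[name]
        self.loads += 1
        return self.blob[offset:offset + length]

    def _put(self, name, data):
        self.cache[name] = data
        self.cache.move_to_end(name)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used entry

    def read(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)
            return self.cache[name]
        data = self._load(name)
        self._put(name, data)
        # Prefetch the correlated files stored in the same merged blob.
        for other in self.index:
            if other != name and other not in self.cache:
                self._put(other, self._load(other))
        return data


files = {"a.txt": b"alpha", "b.txt": b"beta", "c.txt": b"gamma"}
blob, index = merge_files(files)
reader = MergedFileReader(blob, index)
first = reader.read("a.txt")   # cache miss: loads a.txt and prefetches the rest
second = reader.read("c.txt")  # served from the cache, no further load
```

In this sketch the index plays the role that per-file metadata would otherwise play: one merged object needs one metadata entry, and the offsets locate each small file inside it. How the paper measures correlation and sizes its cache is not stated in the abstract, so those choices here are placeholders.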
Pages: 1258-1262
Page count: 5