Implementing WebGIS on Hadoop: A Case Study of Improving Small File I/O Performance on HDFS

Cited by: 0
Authors
Liu, Xuhui [1, 2]
Han, Jizhong [1]
Zhong, Yunqin [1, 2]
Han, Chengde [1]
He, Xubin [3]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Chinese Acad Sci, Grad Univ, Beijing, Peoples R China
[3] Tennessee Technol Univ, Elect & Comp Engn Dept, Cookeville, TN 38505 USA
Keywords
Hadoop; HDFS; WebGIS; Small File I/O Performance;
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The Hadoop framework has been widely used in various clusters to build large-scale, high-performance systems. However, the Hadoop Distributed File System (HDFS) is designed to manage large files and suffers a performance penalty when managing a large number of small files. As a consequence, many web applications, such as WebGIS, may not benefit from Hadoop. In this paper, we propose an approach to optimize the I/O performance of small files on HDFS. The basic idea is to combine small files into large ones to reduce the number of files, and to build an index for each file. Furthermore, some novel features, such as grouping neighboring files and retaining several of the latest versions of the data, are considered to match the characteristics of WebGIS access patterns. Preliminary experimental results show that our approach achieves better performance.
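The merge-and-index idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use any HDFS API: it assumes an in-memory stream stands in for the large merged file, and the tile names and the helper functions `pack_small_files` and `read_small_file` are hypothetical names chosen for illustration.

```python
import io
from typing import BinaryIO, Dict, Iterable, Tuple

# Index maps each small file's name to (offset, length) inside the merged file.
Index = Dict[str, Tuple[int, int]]


def pack_small_files(files: Iterable[Tuple[str, bytes]], out: BinaryIO) -> Index:
    """Append each small file to one large stream and record where it landed."""
    index: Index = {}
    for name, payload in files:
        offset = out.tell()          # current write position in the merged stream
        out.write(payload)
        index[name] = (offset, len(payload))
    return index


def read_small_file(name: str, index: Index, merged: BinaryIO) -> bytes:
    """Fetch one small file from the merged stream using its index entry."""
    offset, length = index[name]
    merged.seek(offset)
    return merged.read(length)


# Hypothetical map tiles; an in-memory buffer stands in for the merged file.
tiles = [
    ("tile_0_0.png", b"AAAA"),
    ("tile_0_1.png", b"BB"),
    ("tile_1_0.png", b"CCCCCC"),
]
merged = io.BytesIO()
index = pack_small_files(tiles, merged)
tile = read_small_file("tile_0_1.png", index, merged)
```

In this sketch the storage layer tracks one large object instead of three small ones, and a lookup costs one index access plus one seek. Writing neighboring tiles consecutively, as the list order does here, is one plausible way to keep files that are accessed together close to each other.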
Pages: 429 / +
Page count: 3
Related papers
50 in total
  • [1] File Placing Control for Improving the I/O Performance of Hadoop in Virtualized Environment
    Nakashima, Kenji
    Fujishima, Eita
    Yamaguchi, Saneyasu
    [J]. 2016 FOURTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR), 2016: 402-407
  • [2] Improving Performance of Small-File Accessing in Hadoop
    Vorapongkitipun, Chatuporn
    Nupairoj, Natawut
    [J]. 2014 11TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2014: 200-205
  • [3] A Novel Approach in Improving I/O Performance of Small Meteorological Files on HDFS
    Xue, Sheng-jun
    Pan, Wu-bin
    Fang, Wei
    [J]. MATERIALS AND COMPUTATIONAL MECHANICS, PTS 1-3, 2012, 117-119: 1759-+
  • [4] Improving Small File I/O Performance for Massive Digital Archives
    Kim, Hwajung
    Yeom, Heonyoung
    [J]. 2017 IEEE 13TH INTERNATIONAL CONFERENCE ON E-SCIENCE (E-SCIENCE), 2017: 256-265
  • [5] Hadoop I/O Performance Improvement by File Layout Optimization
    Fujishima, Eita
    Nakashima, Kenji
    Yamaguchi, Saneyasu
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D(02): 415-427
  • [6] Improving the Performance of HDFS by Reducing I/O Using Adaptable I/O System
    Park, Jung Kyu
    [J]. 2016 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, AND OPTIMIZATION TECHNIQUES (ICEEOT), 2016: 3139-3144
  • [7] Performance Study on Indexing and Accessing of Small File in Hadoop Distributed File System
    Rodrigues, Anisha P.
    Fernandes, Roshan
    Vijaya, P.
    Chander, Satish
    [J]. JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT, 2021, 20(04)
  • [8] Adaptable I/O System based I/O Reduction for Improving the Performance of HDFS
    Park, Jung Kyu
    Kim, Jaeho
    Koo, Sungmin
    Baek, Seungjae
    [J]. JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, 2016, 16(06): 880-888
  • [9] Improving the I/O Performance in the Reduce Phase of Hadoop
    Fujishima, Eita
    Yamaguchi, Saneyasu
    [J]. PROCEEDINGS OF 2015 THIRD INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR), 2015: 82-88
  • [10] Benefit of Compression in Hadoop: A Case Study of Improving IO Performance on Hadoop
    Xiang, Li-Hui
    Miao, Li
    Zhang, Da-Fang
    Chen, Feng-Ping
    [J]. PROCEEDINGS OF THE 6TH INTERNATIONAL ASIA CONFERENCE ON INDUSTRIAL ENGINEERING AND MANAGEMENT INNOVATION: CORE THEORY AND APPLICATIONS OF INDUSTRIAL ENGINEERING, VOL 1, 2016: 879-890