Research of Massive Small Files Reading Optimization Based on Parallel Network File System

Cited by: 1
Authors
Yang, Hongzhang [1, 2]
Zhang, Junwei [1]
Zeng, Xiangchao [1, 2]
Dong, Huanqing [1]
Xu, Lu [1]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Keywords
Small files; pre-read; pNFS; read optimization;
DOI
10.1109/HPCC-CSS-ICESS.2015.97
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
With the rapid development of cloud computing and big data, the number of small files keeps growing. How to manage massive numbers of small files efficiently and serve them with low latency is becoming a hot topic for the Parallel Network File System (pNFS). When pNFS reads massive numbers of small files, metadata is accessed very frequently and disk efficiency is low, so small-file access performance is far below large-file access performance. This paper presents an optimization mechanism for reading small files that comprises extended read dir delegation, radical metadata pre-read, and large-IO data pre-read across small files. These optimizations can significantly reduce read latency and make full use of the client cache. Extensive experiments demonstrate the effectiveness of the mechanism: when reading massive numbers of small files, compared with pNFS, metadata read performance is 1959% higher, sequential data read performance is 2436% higher, random data read performance is 1675% higher, and overall performance is 1767% higher.
Pages: 204-212
Page count: 9