Research on Small File Processing Technology Based on HDFS

Cited by: 0
Author
Gu, Rui
Keywords
HDFS; cloud storage; small files; file merge; insert;
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
With the rapid development of the Internet and the rapid growth in the number of Internet users, the volume of Internet data is also expanding sharply. The emergence of cloud computing offers a good solution to large-scale data computing and storage problems, and massive data storage and analysis has become a very popular research field. HDFS uses a single NameNode to manage the metadata of the entire system and keeps that metadata in memory to improve access efficiency. However, when the system stores a large number of small files, it generates a large amount of metadata, which occupies more NameNode memory. In addition, accessing a large number of small files requires frequent requests to the NameNode, which overloads it. To address this problem, this paper analyzes previous research and improvement schemes and, on that basis, proposes a corresponding improvement. An independent small file processing module is added on top of the original distributed file system. This module merges the small files, creates an index of the files, and passes the file cache to HDFS for data processing.
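The merge-and-index idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `merge_small_files` and `read_small_file`, the in-memory buffer, and the (offset, length) index layout are all assumptions made for illustration, and no HDFS client is involved.

```python
import io
from typing import Dict, Iterable, Tuple

# Hypothetical index layout: file name -> (offset, length) inside the merged blob.
Index = Dict[str, Tuple[int, int]]


def merge_small_files(files: Iterable[Tuple[str, bytes]]) -> Tuple[bytes, Index]:
    """Concatenate small files into one blob and record where each one starts."""
    buffer = io.BytesIO()
    index: Index = {}
    for name, data in files:
        # The current write position is the offset of this file in the blob.
        index[name] = (buffer.tell(), len(data))
        buffer.write(data)
    return buffer.getvalue(), index


def read_small_file(merged: bytes, index: Index, name: str) -> bytes:
    """Recover one original file from the merged blob using the index."""
    offset, length = index[name]
    return merged[offset:offset + length]
```

In a scheme of this kind, only the merged blob would be written to HDFS as a single large file, so the NameNode would track one file's metadata instead of one entry per small file, while lookups of individual files go through the separately kept index.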
Pages: 286-289
Page count: 4