Distributed file system for rewriting Big Data files using a local-write protocol

Cited by: 1
Authors
da Silva, Erico Correia [1]
Sato, Liria Matsumoto [1]
Midorikawa, Edson Toshimi [1]
Affiliations
[1] Univ Sao Paulo, Escola Politecn, Sao Paulo, Brazil
Keywords
Distributed file systems; Hadoop; Big Data; Distributed lock management
DOI
10.1109/BigData52589.2021.9671741
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the exponential volume growth of the data available for scientific and commercial use, more and more Big Data technologies are gaining focus and importance. Directly related to the efficiency of these techniques is the distributed file system used for data persistence, generally based on low-cost computer clusters. However, the environments used today for Big Data are based on file systems that are restricted to the WORM (write once, read many) pattern and lack POSIX compatibility. This work uses distributed lock management techniques to create a file system that allows random writing for both HPC and Big Data tools. A local-write protocol is implemented to leverage the use of local copies of the data during the write process. Experiments were carried out to evaluate the performance of the proposed write protocol and the scalability of the developed file system. From the experimental results, it is possible to conclude that the achieved performance and scalability improvements were obtained by eliminating limitations imposed by HDFS and leveraging local writes.
Pages: 3646-3655
Page count: 10
Related papers
50 in total
  • [21] COMMON PROTOCOL FOR DISTRIBUTED NETWORK FILE SYSTEM
    Peniak, P.
    Kallay, F.
    [J]. ADVANCES IN ELECTRICAL AND ELECTRONIC ENGINEERING, 2008, 7 (1-2) : 231 - 234
  • [22] Big File Protocol (BFP): a Traffic Shaping Approach for Efficient Transport of Large Files
    Albanese, Ilijc
    Yazir, Yagiz Onat
    Neville, Stephen W.
    Ganti, Sudhakar
    Darcie, Thomas E.
    [J]. 2014 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE SWITCHING AND ROUTING (HPSR), 2014, : 125 - 130
  • [23] Distributed PACS using Distributed File System with Hierarchical Meta Data Servers
    Hiroyasu, Tomoyuki
    Minamitani, Yoshiyuki
    Miki, Mitsunori
    Yokouchi, Hisatake
    Yoshimi, Masato
    [J]. 2012 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2012, : 5891 - 5894
  • [24] Private Search Over Big Data Leveraging Distributed File System and Parallel Processing
    Selcuk, Ayse
    Orencik, Cengiz
    Savas, Erkay
    [J]. CLOUD COMPUTING 2015: THE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, GRIDS, AND VIRTUALIZATION, 2015, : 116 - 121
  • [25] A KIND OF DISTRIBUTED FILE SYSTEM BASED ON MASSIVE SMALL FILES STORAGE
    Liu, Di
    Kuang, Shi-Jie
    [J]. 2012 INTERNATIONAL CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (LCWAMTIP), 2012, : 394 - 397
  • [26] Reclaiming space from duplicate files in a serverless distributed file system
    Douceur, JR
    Adya, A
    Bolosky, WJ
    Simon, D
    Theimer, M
    [J]. 22ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, PROCEEDINGS, 2002, : 617 - 624
  • [27] Big Data Performance Analysis on a Hadoop Distributed File System Based on Geometric Data Perturbation Technique
    Marichamy, V. Santhana
    Natarajan, V.
    [J]. 2ND INTERNATIONAL CONFERENCE ON RECENT TRENDS IN ADVANCED COMPUTING ICRTAC -DISRUP - TIV INNOVATION , 2019, 2019, 165 : 415 - 420
  • [28] DISTRIBUTED FILES SHARING MANAGEMENT: A FILE SHARING APPLICATION USING DISTRIBUTED COMPUTING CONCEPTS
    Malgaonkar, Saurabh
    Surve, Sakshi
    Hirave, Tejas
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC), 2012, : 106 - 109
  • [29] Performance Evaluation of Read and Write Operations in Hadoop Distributed File System
    Krishna, T. Lakshmi Siva Rama
    Ragunathan, T.
    Battula, Sudheer Kumar
    [J]. 2014 SIXTH INTERNATIONAL SYMPOSIUM ON PARALLEL ARCHITECTURES, ALGORITHMS AND PROGRAMMING (PAAP), 2014, : 110 - 113
  • [30] Data Structures for Storing File Namespace in Distributed File System
    Long, Luu Hoang
    Choi, Eunmi
    Kim, SangBum
    Kim, Pilsung
    [J]. NCM 2008 : 4TH INTERNATIONAL CONFERENCE ON NETWORKED COMPUTING AND ADVANCED INFORMATION MANAGEMENT, VOL 1, PROCEEDINGS, 2008, : 250 - 255