Building a Version Control System in the Hadoop HDFS

Cited by: 0

Authors:

Yeh, Tsozen [1]

Chien, Tingyu [1]

Affiliation:

[1] Fu Jen Catholic Univ, Dept CSIE, New Taipei, Taiwan

Source:

NOMS 2018 - 2018 IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM | 2018

Keywords:

cloud computing; Hadoop; HDFS; big data;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812

Abstract:

Cloud computing has been widely used in recent years. It facilitates many cutting-edge studies, including big data, the Internet of Things, and others. Cloud computing cannot succeed without a reliable cloud infrastructure to store and handle the enormous volume of data it holds. In the cloud environment, an individual data file commonly consists of data inserted at different points in time. In other words, data files often accumulate chronological versions of their contents after creation. Unfortunately, file contents can be contaminated by bad data or viruses, resulting in errors during data processing. Users can identify the cause of an error more easily and quickly if they can examine and process prior versions of the data files in question. Consequently, by keeping versions of data files, cloud systems can help users solve problems more rapidly when errors occur. Hadoop is one of the most popular platforms in the cloud computing community. We designed and implemented an efficient scheme in HDFS, the default file system in Hadoop, to automatically maintain versions of individual data files when changes are made to them. As a result, our system can retrieve prior versions of data files and display the differences between versions, improving data management in cloud centers.
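The abstract does not describe how the scheme is implemented inside HDFS. As a rough illustration of the three behaviors it names (keeping a version on each change, retrieving a prior version, and displaying the difference between versions), the following is a minimal in-memory sketch. The class `VersionedFile` and its methods are hypothetical names chosen for this example; it stores full copies of each version and uses Python's standard `difflib`, which is an assumption made for brevity and not the authors' method.

```python
import difflib


class VersionedFile:
    """Hypothetical in-memory stand-in for a file whose versions are kept.

    This is an illustration only; it is not the paper's HDFS scheme.
    """

    def __init__(self, name):
        self.name = name
        self.versions = []  # versions[0] holds the oldest content

    def write(self, content):
        """Record a new version only if the content actually changed.

        Returns the 1-based number of the current version.
        """
        if self.versions and self.versions[-1] == content:
            return len(self.versions)
        self.versions.append(content)
        return len(self.versions)

    def read(self, version=None):
        """Return the latest content, or a prior version (1-based)."""
        if not self.versions:
            raise LookupError("no versions recorded")
        if version is None:
            return self.versions[-1]
        if not 1 <= version <= len(self.versions):
            raise LookupError(f"no such version: {version}")
        return self.versions[version - 1]

    def diff(self, old, new):
        """Return a unified diff between two versions as a list of lines."""
        return list(difflib.unified_diff(
            self.read(old).splitlines(),
            self.read(new).splitlines(),
            fromfile=f"{self.name}@v{old}",
            tofile=f"{self.name}@v{new}",
            lineterm="",
        ))
```

A caller would write successive contents with `write`, fetch an earlier state with `read(1)`, and inspect what changed with `diff(1, 2)`. Storing full copies keeps the sketch short; a real system would more likely store deltas or reuse unchanged blocks.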


Pages: 5
