New Data Placement Strategy in the HADOOP Framework

Cited by: 0

Authors:

Elomari, Akram [1]

Hassouni, Larbi [1]

Maizate, Abderrahim [1]

Affiliations:

[1] Univ Hassan 2, RITM ESTC, CED ENSEM, Casablanca, Morocco

Source:

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS | 2021, Vol. 12, No. 07

Keywords:

Big data; data storage; Hadoop; DFS; HDFS; data striping; chunks; placement strategy; performance optimization;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

The volume of data generated and exchanged between information systems continues to grow. Storing and exploiting such volumes is not feasible without big data systems whose mechanisms can meet the technological challenges commonly grouped under the four Vs (Volume, Velocity, Variety and Veracity). Chief among these technologies is the Distributed File System (DFS). Like Hadoop, which is built on HDFS, the main big data systems use distributed data storage in which a subsystem is responsible for subdividing the data (data striping) and replicating it across a network of nodes called a Grid. In the typical case of Hadoop, a Grid generally consists of many nodes grouped into multiple Racks. The logic for distributing stored data across the Grid follows a simple strategy that guarantees the durability of the data and a certain write speed. This strategy takes into account neither the technical characteristics of the nodes nor the number of requests on the data, which means a considerable loss in the processing capacity of the Grid. In this work we propose a new placement strategy based on exploiting and analyzing new information integrated into the HDFS metadata model. The simulations we conducted on Hadoop showed a significant 20% improvement in overall processing time.
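The abstract does not spell out the "simple strategy" it refers to. As a rough illustration only, the sketch below follows the default rack-aware policy described in the HDFS architecture documentation for a replication factor of three: the first replica goes to the writer's node, the second to a node on a different rack, and the third to another node on that same remote rack. The function name `place_replicas`, the `topology` dictionary, and the rack and node names are hypothetical, and the sketch ignores cases real HDFS handles (a writer outside the cluster, full or failed nodes, clusters with a single rack).

```python
import random


def place_replicas(writer, topology, rng=None):
    """Choose three nodes for one block, imitating the default rack-aware policy.

    topology maps a rack name to the list of node names on that rack
    (a hypothetical layout used only for this illustration).
    """
    rng = rng or random.Random()
    # Reverse lookup: node name -> rack name.
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    local_rack = rack_of[writer]

    # Replica 1: the node that is writing the block.
    first = writer

    # Replicas 2 and 3: two distinct nodes on one rack other than the writer's.
    candidates = [
        rack for rack, nodes in topology.items()
        if rack != local_rack and len(nodes) >= 2
    ]
    if not candidates:
        raise ValueError("need a remote rack with at least two nodes")
    remote_rack = rng.choice(candidates)
    second, third = rng.sample(topology[remote_rack], 2)

    return [first, second, third]


# Hypothetical three-rack Grid.
topology = {
    "rack1": ["n1", "n2", "n3"],
    "rack2": ["n4", "n5", "n6"],
    "rack3": ["n7", "n8"],
}
placement = place_replicas("n1", topology, rng=random.Random(0))
print(placement)
```

Nothing in this choice looks at the hardware of a node or at how often a block is requested, which is the gap the abstract points to.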


Pages: 676-684

Page count: 9
