New Data Placement Strategy in the HADOOP Framework

Cited by: 0
Authors
Elomari, Akram [1]
Hassouni, Larbi [1]
Maizate, Abderrahim [1]
Affiliations
[1] Univ Hassan 2, RITM ESTC, CED ENSEM, Casablanca, Morocco
Keywords
Big data; data storage; Hadoop; DFS; HDFS; data striping; chunks; placement strategy; performance optimization
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Today, the quantities of data generated and exchanged between information systems continue to increase. Storing and exploiting such quantities cannot be done without big data systems whose mechanisms can meet the technological challenges commonly grouped under the four Vs (Volume, Velocity, Variety and Veracity). These technologies mainly include the Distributed File System (DFS). Like Hadoop, which is based on HDFS, the main big data systems use distributed data storage, in which a subsystem is responsible for subdividing data (data striping) and replicating it across a network of nodes called a Grid. In the typical case of Hadoop, a Grid generally consists of many nodes grouped into multiple Racks. The logic for distributing stored data across the Grid follows a simple strategy that guarantees the durability of the data and a certain write speed. This strategy takes into consideration neither the technical characteristics of the nodes nor the number of requests on the data, which means a considerable loss in the processing capacity of the Grid. In this work we propose a new placement strategy based on exploiting and analyzing new information integrated into the HDFS metadata model. A significant 20% improvement in overall processing time was reached in the simulations we conducted on Hadoop.
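The abstract does not spell out the placement algorithm, so the following is only a minimal sketch of the general idea it describes: choosing replica nodes by node capacity and by the number of requests on the data, while still spreading replicas over more than one Rack. The `Node` fields, the `place_block` function, and the scoring rule are all hypothetical and are not taken from the paper or from the HDFS API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rack: str
    capacity: float                 # hypothetical relative processing capacity
    expected_requests: float = 0.0  # requests already assigned to this node


def place_block(nodes, request_count, replicas=3):
    """Pick replica nodes for one block (illustrative assumption, not the paper's method).

    Nodes are ranked by the request load per unit of capacity they would
    carry after receiving the block; lower is better. Ties break by name.
    """
    ranked = sorted(
        nodes,
        key=lambda n: ((n.expected_requests + request_count) / n.capacity, n.name),
    )
    chosen = ranked[:replicas]

    # Keep at least two racks when possible, so that losing one rack
    # does not lose every replica of the block.
    if len(chosen) > 1 and len({n.rack for n in chosen}) == 1:
        other = next((n for n in ranked[replicas:] if n.rack != chosen[0].rack), None)
        if other is not None:
            chosen[-1] = other

    # Record the extra load so later placements see it.
    for n in chosen:
        n.expected_requests += request_count
    return [n.name for n in chosen]


if __name__ == "__main__":
    grid = [
        Node("a", "rack1", 4.0),
        Node("b", "rack1", 4.0),
        Node("c", "rack1", 2.0),
        Node("d", "rack2", 1.0),
    ]
    print(place_block(grid, request_count=10))
    print(place_block(grid, request_count=2, replicas=2))
```

In this sketch a frequently requested block lands on the nodes with the most spare capacity, whereas a placement that ignores capacity and request counts could put it on the weakest node. The rack check is a simplified stand-in for rack-aware replication.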
Pages: 676-684
Page count: 9