Efficient Random Data Accessing in MapReduce

Cited by: 0

Authors:

Mittal, Mamta [1]

Singh, Hari [2]

Paliwal, K. K. [2]

Goyal, Lalit Mohan [3]

Affiliations:

[1] GB Pant Govt Engn Coll, New Delhi, India

[2] Panipat Inst Engn & Technol, Panipat, Haryana, India

[3] Bharati Vidyapeeths Coll Engn, New Delhi, India

Source:

2017 INTERNATIONAL CONFERENCE ON INFOCOM TECHNOLOGIES AND UNMANNED SYSTEMS (TRENDS AND FUTURE DIRECTIONS) (ICTUS) | 2017

Keywords:

Hadoop; MapReduce; HDFS; B-Tree; Index; Framework

DOI:

Not available

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Voluminous data cannot be handled using traditional serial programming methods; it must be processed with parallel programming methods in a distributed environment. Emerging technologies for parallel processing have been changing the concepts of programming, storage, and operating systems in distributed environments. Grid computing and MapReduce have proven very useful for processing huge volumes of simple and multi-dimensional data. MapReduce in Hadoop provides an abstract environment for parallel processing of jobs, and the framework is well known for its data analysis capability. However, while it is efficient for sequential reads and writes, it does not perform well for random reads and writes, because Hadoop is based on the key-value storage concept and does not work on an indexed dataset. Much work has been done to improve the performance of Hadoop, and indexing the input dataset is one such area. This paper describes a B-Tree index construction process in a traditional programming environment and presents a conceptual idea for realizing it in MapReduce-Hadoop.
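As an illustration only, the B-Tree indexing idea named in the abstract can be sketched in a traditional (single-process) setting as follows. This is not the authors' construction: the `BTree` and `Node` classes, the minimum degree `t`, the record layout, and the `random_read` helper are all hypothetical, and an in-memory `io.BytesIO` buffer stands in for a data file. The index maps each record key to its byte offset, so a lookup can seek directly to the record instead of scanning the file sequentially.

```python
import io
from bisect import bisect_left


class Node:
    """One B-Tree node holding sorted keys and their record offsets."""

    def __init__(self, leaf=True):
        self.keys = []      # sorted record keys
        self.vals = []      # byte offset for each key
        self.children = []  # child nodes (empty for a leaf)
        self.leaf = leaf


class BTree:
    """Minimal B-Tree (hypothetical sketch) with minimum degree t."""

    def __init__(self, t=2):
        self.t = t
        self.root = Node()

    def search(self, key, node=None):
        node = self.root if node is None else node
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.vals[i]
        if node.leaf:
            return None
        return self.search(key, node.children[i])

    def insert(self, key, val):
        if len(self.root.keys) == 2 * self.t - 1:
            # A full root is split first, which grows the tree by one level.
            new_root = Node(leaf=False)
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        self._insert_nonfull(self.root, key, val)

    def _split_child(self, parent, i):
        t = self.t
        full = parent.children[i]
        right = Node(leaf=full.leaf)
        # The median key moves up into the parent.
        parent.keys.insert(i, full.keys[t - 1])
        parent.vals.insert(i, full.vals[t - 1])
        parent.children.insert(i + 1, right)
        right.keys, right.vals = full.keys[t:], full.vals[t:]
        full.keys, full.vals = full.keys[:t - 1], full.vals[:t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]

    def _insert_nonfull(self, node, key, val):
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            node.vals[i] = val  # assumption: a repeated key overwrites
            return
        if node.leaf:
            node.keys.insert(i, key)
            node.vals.insert(i, val)
            return
        if len(node.children[i].keys) == 2 * self.t - 1:
            self._split_child(node, i)
            if key == node.keys[i]:
                node.vals[i] = val
                return
            if key > node.keys[i]:
                i += 1
        self._insert_nonfull(node.children[i], key, val)


# Build the data "file" and its index in one sequential pass.
data = io.BytesIO()
index = BTree(t=2)
KEYS = [17, 4, 42, 8, 15, 23, 16, 1, 99, 50]
for key in KEYS:
    offset = data.tell()
    data.write(f"{key},value-{key}\n".encode())
    index.insert(key, offset)


def random_read(key):
    """Fetch one record by key via the index, without scanning the file."""
    offset = index.search(key)
    if offset is None:
        return None
    data.seek(offset)
    return data.readline().decode().rstrip("\n")
```

Calling `random_read(23)` seeks straight to the stored offset for key 23, while a key that was never indexed yields `None`. How such an index would be partitioned across mappers and reducers is left open here, since the record gives only the abstract.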


Pages: 552-556

Page count: 5


