Efficient Random Data Accessing in MapReduce

Cited by: 0

Authors:

Mittal, Mamta [1]

Singh, Hari [2]

Paliwal, K. K. [2]

Goyal, Lalit Mohan [3]

Affiliations:

[1] GB Pant Govt Engn Coll, New Delhi, India

[2] Panipat Inst Engn & Technol, Panipat, Haryana, India

[3] Bharati Vidyapeeths Coll Engn, New Delhi, India

Source:

2017 INTERNATIONAL CONFERENCE ON INFOCOM TECHNOLOGIES AND UNMANNED SYSTEMS (TRENDS AND FUTURE DIRECTIONS) (ICTUS) | 2017

Keywords:

Hadoop; MapReduce; HDFS; B-Tree; Index; FRAMEWORK;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Voluminous data cannot be handled using traditional serial programming methods. It needs to be dealt with effectively using parallel programming methods in a distributed environment. Emerging technologies for parallel processing have been changing the concepts of programming, storage, and operating systems in distributed environments. Grid computing and MapReduce technologies have proven very handy in processing huge volumes of simple and multi-dimensional data. MapReduce in Hadoop provides an abstract environment for parallel processing of jobs. The framework is well known for its data analysis capability. However, it is efficient for sequential reads and writes but does not perform well for random reads and writes. This is because Hadoop is based on the key-value storage concept and does not work on an indexed dataset. A lot of work has been done to improve the performance of Hadoop, and indexing the input dataset is one such area. In this paper, a B-Tree index construction process in a traditional programming environment is described, and a conceptual idea for realizing it in MapReduce-Hadoop is presented.


Pages: 552-556

Number of pages: 5
