Efficient Random Data Accessing in MapReduce

Cited by: 0

Authors:

Mittal, Mamta [1]

Singh, Hari [2]

Paliwal, K. K. [2]

Goyal, Lalit Mohan [3]

Affiliations:

[1] GB Pant Govt Engn Coll, New Delhi, India

[2] Panipat Inst Engn & Technol, Panipat, Haryana, India

[3] Bharati Vidyapeeths Coll Engn, New Delhi, India

Source:

2017 INTERNATIONAL CONFERENCE ON INFOCOM TECHNOLOGIES AND UNMANNED SYSTEMS (TRENDS AND FUTURE DIRECTIONS) (ICTUS) | 2017

Keywords:

Hadoop; MapReduce; HDFS; B-Tree; Index; FRAMEWORK;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Voluminous data cannot be handled using traditional serial programming methods; it needs to be handled effectively using parallel programming methods in a distributed environment. Emerging technologies for parallel processing have been changing the concepts of programming, storage, and operating systems in distributed environments. Grid computing and MapReduce have proven very handy for processing huge volumes of simple and multi-dimensional data. MapReduce in Hadoop provides an abstract environment for parallel processing of jobs, and the framework is well known for its data analysis capability. However, it is efficient for sequential reads and writes but does not perform well for random reads and writes, because Hadoop is based on the key-value storage concept and does not work on an indexed dataset. A lot of work has been done to improve the performance of Hadoop, and indexing the input dataset is one such area. In this paper, a B-Tree index construction process in a traditional programming environment is described, and a conceptual idea of realizing it in MapReduce-Hadoop is presented.


Pages: 552 - 556

Page count: 5

Related records: 50 in total

[21] streamingRPHash: Random Projection Clustering of High-Dimensional Data in a MapReduce Framework
Franklin, Jacob
Wenke, Samuel
Quasem, Sadiq
Carraher, Lee A.
Wilsey, Philip A.
2016 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2016, : 168 - 169
[22] Efficient Snapshot KNN Join Processing for Large Data Using MapReduce
Hu, Yupeng
Yang, Chong
Ji, Cun
Xu, Yang
Li, Xueqing
2016 IEEE 22ND INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2016, : 713 - 720
[23] A Generalized MapReduce Approach for Efficient mining of Large data Sets in the GRID
Roehm, Matthias
Grabert, Matthias
Schweiggert, Franz
PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, GRIDS, AND VIRTUALIZATION (CLOUD COMPUTING 2010), 2010, : 14 - 19
[24] Efficient Alignment of Next Generation Sequencing Data Using MapReduce on the Cloud
AlSaad, Rawan
Malluhi, Qutaibah
Abouelhoda, Mohamed
2012 CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE (CIBEC), 2012, : 18 - 22
[25] CSRA: An Efficient Resource Allocation Algorithm in MapReduce Considering Data Skewness
Qi, Ling
Tang, Zhuo
Qin, Yunchuan
Ye, Yu
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2015, 2015, 9403 : 651 - 662
[26] Efficient Querying Distributed Big-XML Data using MapReduce
Song Kunfang
Hongwei Lu
INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2016, 8 (03) : 70 - 79
[27] Efficient MapReduce Kernel k-Means for Big Data Clustering
Tsapanos, Nikolaos
Tefas, Anastasios
Nikolaidis, Nikolaos
Pitas, Ioannis
9TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE (SETN 2016), 2016,
[28] Efficient Distributed Density Peaks for Clustering Large Data Sets in MapReduce
Zhang, Yanfeng
Chen, Shimin
Yu, Ge
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (12) : 3218 - 3230
[29] Utilizing the Buckshot Algorithm for Efficient Big Data Clustering in the MapReduce Model
Gerakidis, Sergios
Mamalis, Basilis
PROCEEDINGS OF THE 23RD PAN-HELLENIC CONFERENCE OF INFORMATICS (PCI 2019), 2019, : 112 - 117
[30] Efficient finer-grained incremental processing with MapReduce for big data
Zhang, Liang
Feng, Yuanyuan
Shen, Peiyi
Zhu, Guangming
Wei, Wei
Song, Juan
Shah, Syed Afaq Ali
Bennamoun, Mohammed
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 80 : 102 - 111
