Efficient Random Data Accessing in MapReduce

Cited by: 0
Authors
Mittal, Mamta [1]
Singh, Hari [2]
Paliwal, K. K. [2]
Goyal, Lalit Mohan [3]
Affiliations
[1] GB Pant Govt Engn Coll, New Delhi, India
[2] Panipat Inst Engn & Technol, Panipat, Haryana, India
[3] Bharati Vidyapeeths Coll Engn, New Delhi, India
Keywords
Hadoop; MapReduce; HDFS; B-Tree; Index; FRAMEWORK;
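DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract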
Voluminous data cannot be handled using traditional serial programming methods; it needs to be processed using parallel programming methods in a distributed environment. Emerging technologies for parallel processing have been changing the concepts of programming, storage, and operating systems in distributed environments. Grid computing and MapReduce technologies have proven very useful for processing huge volumes of simple and multi-dimensional data. MapReduce in Hadoop provides an abstract environment for parallel processing of jobs, and the framework is well known for its data analysis capability. However, it is efficient for sequential reads and writes but does not perform well for random reads and writes, because Hadoop is based on the key-value storage concept and does not work on an indexed dataset. A lot of work has been done to improve the performance of Hadoop, and indexing the input dataset is one such area. In this paper, a B-Tree index construction process in a traditional programming environment is described, and a conceptual idea of realizing it in MapReduce-Hadoop is presented.
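To make the idea of indexed random access concrete, the following is a minimal, single-machine sketch of a B-Tree that maps record keys to byte offsets, so that one record can be fetched without scanning the whole dataset. It is not the authors' implementation and does not show the MapReduce realization; the names `BTreeNode`, `BTree`, `lookup`, the minimum degree `t`, and the sample records are all assumptions made for illustration.

```python
from bisect import bisect_left


class BTreeNode:
    """One B-Tree node (hypothetical helper class for this sketch)."""

    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []      # keys in ascending order
        self.values = []    # values[i] belongs to keys[i] (here: a byte offset)
        self.children = []  # len(keys) + 1 children when the node is not a leaf


class BTree:
    """In-memory B-Tree with minimum degree t (each node holds at most 2t - 1 keys)."""

    def __init__(self, t=2):
        self.t = t
        self.root = BTreeNode()

    def search(self, key):
        """Return the value stored for key, or None if the key is absent."""
        node = self.root
        while True:
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return node.values[i]
            if node.leaf:
                return None
            node = node.children[i]

    def insert(self, key, value):
        """Insert key -> value, overwriting the value if the key already exists."""
        if len(self.root.keys) == 2 * self.t - 1:
            # A full root is split first, which grows the tree by one level.
            new_root = BTreeNode(leaf=False)
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        node = self.root
        while True:
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value
                return
            if node.leaf:
                node.keys.insert(i, key)
                node.values.insert(i, value)
                return
            if len(node.children[i].keys) == 2 * self.t - 1:
                # Split a full child before descending so the leaf always has room.
                self._split_child(node, i)
                if key == node.keys[i]:
                    node.values[i] = value
                    return
                if key > node.keys[i]:
                    i += 1
            node = node.children[i]

    def _split_child(self, parent, i):
        """Split the full child parent.children[i] around its median key."""
        t = self.t
        full = parent.children[i]
        right = BTreeNode(leaf=full.leaf)
        mid_key, mid_val = full.keys[t - 1], full.values[t - 1]
        right.keys, right.values = full.keys[t:], full.values[t:]
        full.keys, full.values = full.keys[:t - 1], full.values[:t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]
        parent.keys.insert(i, mid_key)
        parent.values.insert(i, mid_val)
        parent.children.insert(i + 1, right)


# Illustrative data: "key,payload" lines stored back to back, as in a flat file.
sample_keys = (42, 7, 19, 3, 88, 61, 15, 27)
records = [f"{k},value-{k}\n" for k in sample_keys]
data = "".join(records).encode()

# Build the index: key -> byte offset of the record inside `data`.
index = BTree(t=2)
offset = 0
for line in records:
    index.insert(int(line.split(",")[0]), offset)
    offset += len(line.encode())


def lookup(key):
    """Fetch one record through the index instead of scanning `data`."""
    pos = index.search(key)
    if pos is None:
        return None
    return data[pos:data.index(b"\n", pos)].decode()
```

With this sketch, `lookup(61)` walks a few tree nodes and then reads a single record at its stored offset, whereas an unindexed lookup would have to read records sequentially until it found the key.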
Pages: 552-556
Number of pages: 5
Related papers
  • [1] Parallel Accessing Massive NetCDF Data Based on MapReduce
    Zhao, Hui
    Ai, SiYun
    Lv, ZhenHua
    Li, Bo
    [J]. WEB INFORMATION SYSTEMS AND MINING, 2010, 6318 : 425 - +
  • [2] Efficient Way of Searching Data in MapReduce Paradigm
    Shah, Gita
    Annappa
    Shet, K. C.
    [J]. 2014 INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT (INDIACOM), 2014, : 305 - 310
  • [3] Efficient Big Data Processing in Hadoop MapReduce
    Dittrich, Jens
    Quiane-Ruiz, Jorge-Arnulfo
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12): : 2014 - 2015
  • [4] Virtual Shuffling for Efficient Data Movement in MapReduce
    Yu, Weikuan
    Wang, Yandong
    Que, Xinyu
    Xu, Cong
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2015, 64 (02) : 556 - 568
  • [5] A Demonstration of SpatialHadoop: An Efficient MapReduce Framework for Spatial Data
    Eldawy, Ahmed
    Mokbel, Mohamed F.
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (12): : 1230 - 1233
  • [6] An Efficient MapReduce Cube Algorithm for Varied Data Distributions
    Milo, Tova
    Altshuler, Eyal
    [J]. SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 1151 - 1165
  • [7] On the use of MapReduce for imbalanced big data using Random Forest
    del Rio, Sara
    Lopez, Victoria
    Manuel Benitez, Jose
    Herrera, Francisco
    [J]. INFORMATION SCIENCES, 2014, 285 : 112 - 137
  • [8] An Efficient Similarity Search in Large Data Collections with MapReduce
    Trong Nhan Phan
    Kueng, Josef
    Tran Khanh Dang
    [J]. FUTURE DATA AND SECURITY ENGINEERING, FDSE 2014, 2014, 8860 : 44 - 57
  • [9] MapReduce Distributed Highly Random Fuzzy Forest for Noisy Big Data
    Mustafic, Faruk
    Xiong, Ning
    Herrera, Francisco
    Gallego, Sergio Ramírez
    [J]. 2017 13TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2017, : 560 - 567
  • [10] Hybrid storage architecture and efficient MapReduce processing for unstructured data
    Lu, Weiming
    Wang, Yaoguang
    Jiang, Jingyuan
    Liu, Jian
    Shen, Yapeng
    Wei, Baogang
    [J]. PARALLEL COMPUTING, 2017, 69 : 63 - 77