Efficient Random Data Accessing in MapReduce

Cited by: 0
Authors
Mittal, Mamta [1]
Singh, Hari [2]
Paliwal, K. K. [2]
Goyal, Lalit Mohan [3]
Affiliations
[1] GB Pant Govt Engn Coll, New Delhi, India
[2] Panipat Inst Engn & Technol, Panipat, Haryana, India
[3] Bharati Vidyapeeths Coll Engn, New Delhi, India
Keywords
Hadoop; MapReduce; HDFS; B-Tree; Index; Framework
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Voluminous data cannot be handled with traditional serial programming methods; it must be processed effectively with parallel programming methods in a distributed environment. Emerging technologies for parallel processing have been changing the concepts of programming, storage, and operating systems in distributed environments. Grid computing and MapReduce have proven very useful for processing huge volumes of simple and multi-dimensional data. MapReduce in Hadoop provides an abstract environment for the parallel processing of jobs, and the framework is well known for its data analysis capability. However, it is efficient for sequential reads and writes but does not perform well for random reads and writes, because Hadoop is based on the key-value storage concept and does not work on an indexed dataset. Much work has been done to improve the performance of Hadoop, and indexing the input dataset is one such area. This paper describes a B-Tree index construction process in a traditional programming environment and presents a conceptual idea for realizing it in MapReduce on Hadoop.
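The idea summarized in the abstract, using a B-Tree index so that a record can be fetched by key without scanning the whole input sequentially, can be illustrated with a minimal single-machine sketch. This is not the paper's implementation: the class and function names (`BTree`, `build_index`, `lookup`), the tab-separated record layout, and the minimum degree `t=2` are all assumptions chosen for illustration, and nothing here uses Hadoop or MapReduce.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []      # sorted keys
        self.values = []    # byte offsets, parallel to keys
        self.children = []  # child nodes (empty for leaves)


class BTree:
    """Hypothetical in-memory B-Tree mapping key -> byte offset."""

    def __init__(self, t=2):
        self.t = t  # minimum degree: a node holds at most 2t - 1 keys
        self.root = BTreeNode()

    def search(self, key):
        node = self.root
        while True:
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return node.values[i]
            if node.leaf:
                return None
            node = node.children[i]

    def insert(self, key, value):
        if len(self.root.keys) == 2 * self.t - 1:
            # Root is full: grow the tree by one level, then split.
            old_root = self.root
            self.root = BTreeNode(leaf=False)
            self.root.children.append(old_root)
            self._split_child(self.root, 0)
        node = self.root
        while True:
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value  # existing key: overwrite offset
                return
            if node.leaf:
                node.keys.insert(i, key)
                node.values.insert(i, value)
                return
            if len(node.children[i].keys) == 2 * self.t - 1:
                self._split_child(node, i)
                if key == node.keys[i]:
                    node.values[i] = value
                    return
                if key > node.keys[i]:
                    i += 1
            node = node.children[i]

    def _split_child(self, parent, i):
        t = self.t
        child = parent.children[i]
        right = BTreeNode(leaf=child.leaf)
        mid_key, mid_val = child.keys[t - 1], child.values[t - 1]
        right.keys, right.values = child.keys[t:], child.values[t:]
        child.keys, child.values = child.keys[:t - 1], child.values[:t - 1]
        if not child.leaf:
            right.children = child.children[t:]
            child.children = child.children[:t]
        # The median key moves up into the parent.
        parent.keys.insert(i, mid_key)
        parent.values.insert(i, mid_val)
        parent.children.insert(i + 1, right)


def build_index(f, t=2):
    """Scan a binary file of 'key<TAB>value' lines once; index key -> offset."""
    index = BTree(t)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            return index
        index.insert(line.split(b"\t", 1)[0].decode(), offset)


def lookup(f, index, key):
    """Random read: seek straight to the record instead of scanning."""
    offset = index.search(key)
    if offset is None:
        return None
    f.seek(offset)
    return f.readline().rstrip(b"\n").split(b"\t", 1)[1].decode()
```

After one sequential pass to build the index, each `lookup` costs a tree descent plus a single seek, which is the contrast with sequential scanning that the abstract draws. How such an index would be partitioned across map and reduce tasks is left open here, as in the abstract.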
Pages: 552-556
Page count: 5
Related papers
50 in total
  • [41] Efficient Data Structures for Risk Modelling in Portfolios of Catastrophic Risk Using MapReduce
    Rau-Chaplin, Andrew
    Yao, Zhimin
    Zeh, Norbert
    2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2014, 29 : 1557 - 1568
  • [42] Data pipeline in MapReduce
    Zeng, Jiaan
    Plale, Beth
    2013 IEEE 9TH INTERNATIONAL CONFERENCE ON E-SCIENCE (E-SCIENCE), 2013, : 164 - 171
  • [43] Random access file service - A service for randomly accessing remote data in grid environment
    Wang, Q
    Liu, J
    Fu, SS
    Proceedings of the 11th Joint International Computer Conference, 2005, : 193 - 198
  • [44] RANDOM ACCESSING FOR LARGE TAPE FILES
    KAIMANN, RA
    DATA PROCESSING MAGAZINE, 1969, 11 (02): : 22 - &
  • [45] Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling
    Chen, Rong
    Chen, Haibo
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2013, 10 (01)
  • [46] MapReduce-Based Improved Random Forest Model for Massive Educational Data Processing and Classification
    Wei Xu
    Vinh Truong Hoang
    Mobile Networks and Applications, 2021, 26 : 191 - 199
  • [47] Load balancing in reducers for skewed data in MapReduce systems by using scalable simple random sampling
    Elaheh Gavagsaz
    Ali Rezaee
    Hamid Haj Seyyed Javadi
    The Journal of Supercomputing, 2018, 74 : 3415 - 3440
  • [48] Load balancing in reducers for skewed data in MapReduce systems by using scalable simple random sampling
    Gavagsaz, Elaheh
    Rezaee, Ali
    Javadi, Hamid Haj Seyyed
    JOURNAL OF SUPERCOMPUTING, 2018, 74 (07): : 3415 - 3440
  • [49] MapReduce-Based Improved Random Forest Model for Massive Educational Data Processing and Classification
    Xu, Wei
    Hoang, Vinh Truong
    MOBILE NETWORKS & APPLICATIONS, 2021, 26 (01): : 191 - 199
  • [50] Implementation of on-process aggregation for Efficient Big Data Processing in Hadoop MapReduce Environment
    Pol, Vidya V.
    Patil, S. M.
    2016 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT), VOL 3, 2015, : 445 - 449