Indexing optimizations on Hadoop

Authors
Bagwari, Neha [1]
Kumar, Omesh [1]
Affiliations
[1] ABES Engn Coll, Comp Sci Dept, Ghaziabad, India
Keywords
Indexing; searching; Hadoop ecosystem
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Hadoop is an efficient open-source framework for storing and processing big data. Its HDFS component stores data in a distributed manner while preserving consistency and availability, and MapReduce is responsible for parallel processing. Hadoop is well suited to fault-tolerant storage and batch processing, but searching is not optimized because data is stored in the form of blocks. The lack of an optimized index design makes searching costly. To address this, various indexing approaches have been proposed as improvements to the Hadoop architecture. In most of these approaches, MapReduce generates the index at run time to process the data distributed across the cluster. This paper compares the existing indexing approaches and proposes a new index creation and storage technique for the Hadoop ecosystem, which is expected to lead to better search results in a Hadoop environment.
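The abstract does not describe the proposed technique itself, so the following is only a generic illustration of the idea it refers to: building an inverted index with a map step and a reduce step so that a search can go straight to the relevant blocks instead of scanning all of them. It is a single-process Python sketch, not Hadoop code and not the authors' method; the block ids, sample lines, and function names (`map_phase`, `reduce_phase`, `build_index`, `search`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for HDFS blocks: block id -> lines stored in that block.
BLOCKS = {
    "blk_0": ["hadoop stores data in blocks", "mapreduce processes data"],
    "blk_1": ["an index speeds up searching", "hadoop batch processing"],
}


def map_phase(block_id, lines):
    """Emit (term, (block_id, line_no)) for every word in one block."""
    for line_no, line in enumerate(lines):
        for term in line.lower().split():
            yield term, (block_id, line_no)


def reduce_phase(pairs):
    """Group postings by term, as a reducer would after the shuffle step."""
    grouped = defaultdict(list)
    for term, posting in pairs:
        grouped[term].append(posting)
    # Deduplicate and sort each posting list for stable lookups.
    return {term: sorted(set(postings)) for term, postings in grouped.items()}


def build_index(blocks):
    """Run the map step over every block, then reduce into one index."""
    pairs = []
    for block_id, lines in blocks.items():
        pairs.extend(map_phase(block_id, lines))
    return reduce_phase(pairs)


def search(index, term):
    """Return the (block_id, line_no) postings for a term, or an empty list."""
    return index.get(term.lower(), [])


index = build_index(BLOCKS)
print(search(index, "hadoop"))
```

With an index of this kind, a lookup touches only the blocks named in the posting list; without one, every block has to be read to answer the same query, which is the cost the abstract points to.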
Pages: 7