Indexing optimizations on Hadoop

Authors
Bagwari, Neha [1]
Kumar, Omesh [1]
Affiliations
[1] ABES Engn Coll, Comp Sci Dept, Ghaziabad, India
Keywords
Indexing; searching; Hadoop ecosystem
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Hadoop is an efficient open-source framework for storing and processing big data. Its HDFS component stores data in a distributed manner while preserving consistency and availability, and MapReduce is responsible for parallel processing. Hadoop is well suited to fault-tolerant storage and batch processing, but searching is not optimized because Hadoop stores data as blocks. It lacks an optimized index design, which makes searching costly. To address this, various indexing approaches have been proposed as improvements to the Hadoop architecture. In most of these approaches, MapReduce generates the index at run time to process the data distributed across the cluster. This paper compares the existing indexing approaches and proposes a new index creation and storage technique for the Hadoop ecosystem that is expected to yield better search results in a Hadoop environment.
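As a rough illustration of why an index helps when data is stored as blocks, the following minimal Python sketch builds an inverted index in a map/reduce style and uses it to limit which blocks a search has to scan. It is not the technique proposed in the paper and does not use any Hadoop API; the block layout and the function names (split_into_blocks, map_phase, reduce_phase, build_index, search) are assumptions made purely for illustration.

```python
from collections import defaultdict


def split_into_blocks(lines, lines_per_block):
    """Group lines into fixed-size blocks (a toy stand-in for HDFS blocks)."""
    return [lines[i:i + lines_per_block]
            for i in range(0, len(lines), lines_per_block)]


def map_phase(block_id, block):
    """Map step: emit a (term, block_id) pair for every word in one block."""
    for line in block:
        for term in line.lower().split():
            yield term, block_id


def reduce_phase(pairs):
    """Reduce step: merge pairs into term -> sorted list of block ids."""
    index = defaultdict(set)
    for term, block_id in pairs:
        index[term].add(block_id)
    return {term: sorted(ids) for term, ids in index.items()}


def build_index(blocks):
    """Run the map step over every block, then reduce the emitted pairs."""
    pairs = (pair
             for block_id, block in enumerate(blocks)
             for pair in map_phase(block_id, block))
    return reduce_phase(pairs)


def search(index, blocks, term):
    """Scan only the blocks the index lists for the term, not every block."""
    term = term.lower()
    return [line
            for block_id in index.get(term, [])
            for line in blocks[block_id]
            if term in line.lower().split()]


# Hypothetical sample data, two lines per block.
lines = [
    "hadoop stores data in blocks",
    "mapreduce processes data in parallel",
    "an index narrows the search",
    "hdfs replicates every block",
]
blocks = split_into_blocks(lines, 2)
index = build_index(blocks)
matches = search(index, blocks, "hadoop")
```

Without the index, a search would have to read every block; with it, only the blocks listed for the term are read. A real deployment would also have to decide where the index is stored and when it is built, which is the design space the abstract refers to.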
Pages: 7