Inverted Indexing In Big Data Using Hadoop Multiple Node Cluster

Authors
Velusamy, Kaushik [1]
Vijayaraju, Nivetha [1]
Venkitaramanan, Deepthi [1]
Suresh, Greeshma [1]
Madhu, Divya [2]
Affiliations
[1] Amrita Univ, Dept CSE, Coimbatore, Tamil Nadu, India
[2] Amrita Univ, Dept IT, Coimbatore, Tamil Nadu, India
Keywords
Hadoop; Big data; inverted indexing; data structure;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
An inverted index is an efficient, standard data structure well suited to search operations over very large data sets. Such data is mostly unstructured and does not fit into traditional database categories. Processing it at scale calls for a distributed framework such as Hadoop, in which computational resources can easily be shared and accessed. This work implements a search engine in Hadoop over millions of Wikipedia documents, using an inverted index to make search operations more efficient. An inverted index maps each word in a file or set of files to the locations where it occurs. The structure uses a hash table that stores each word as a key and its locations as the corresponding values, which gives easy lookup and retrieval and makes it suitable for search operations.
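The word-to-locations mapping described in the abstract can be sketched in a few lines of Python. This is a minimal single-process illustration, not the paper's Hadoop implementation: the function names (map_document, build_inverted_index, search), the sample documents, and the choice of token position as the "location" are all assumptions made for this example. The map and reduce steps are only imitated in memory to show the shape of the computation.

```python
import re
from collections import defaultdict


def map_document(doc_id, text):
    """Map step (hypothetical): emit (word, (doc_id, position)) pairs."""
    tokens = re.finditer(r"[a-z0-9]+", text.lower())
    for position, match in enumerate(tokens):
        yield match.group(), (doc_id, position)


def build_inverted_index(documents):
    """Reduce step (hypothetical): group emitted pairs by word.

    The result is a hash table (a Python dict) whose keys are words and
    whose values are lists of (doc_id, token position) locations.
    """
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for word, location in map_document(doc_id, text):
            index[word].append(location)
    return dict(index)


def search(index, word):
    """Look up one word; returns an empty list when the word is absent."""
    return index.get(word.lower(), [])


# Illustrative documents, not data from the paper.
docs = {
    "doc1": "Hadoop stores big data",
    "doc2": "Big data needs Hadoop",
}
index = build_inverted_index(docs)
print(search(index, "Hadoop"))  # [('doc1', 0), ('doc2', 3)]
```

A lookup is a single dictionary access, so its cost does not depend on the number of indexed documents. In a real Hadoop job the grouping by word would be done by the framework's shuffle phase between mappers and reducers rather than by one in-memory dictionary.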
Pages: 156-161
Number of pages: 6