Inverted Indexing In Big Data Using Hadoop Multiple Node Cluster

Cited by: 0
Authors
Velusamy, Kaushik [1]
Vijayaraju, Nivetha [1]
Venkitaramanan, Deepthi [1]
Suresh, Greeshma [1]
Madhu, Divya [2]
Affiliations
[1] Amrita Univ, Dept CSE, Coimbatore, Tamil Nadu, India
[2] Amrita Univ, Dept IT, Coimbatore, Tamil Nadu, India
Keywords
Hadoop; Big data; inverted indexing; data structure
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
An inverted index is an efficient, standard data structure well suited to search operations over very large data sets. Such data is mostly unstructured and does not fit traditional database categories. Processing it at scale requires a distributed framework such as Hadoop, in which computational resources can easily be shared and accessed. This work implements a search engine in Hadoop over millions of Wikipedia documents, using an inverted index to make search operations more effective. An inverted index maps each word in a file or set of files to its corresponding locations. It uses a hash table that stores each word as a key and the word's locations as the values, which gives easy lookup and retrieval of data and makes the structure suitable for search operations.
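
The word-to-locations mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' Hadoop implementation: it is a single-process Python imitation of a map step and a reduce step, and the function names (`map_phase`, `reduce_phase`, `build_index`), the tokenization rule, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit (word, (doc_id, position)) pairs for one document.

    Tokenization (lowercase alphanumeric runs) is an assumption,
    not something the abstract specifies.
    """
    for position, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
        yield word, (doc_id, position)


def reduce_phase(pairs):
    """Group emitted locations by word into a hash table (a dict)."""
    index = defaultdict(list)
    for word, location in pairs:
        index[word].append(location)
    return dict(index)


def build_index(docs):
    """Run the map step over every document, then reduce the pairs."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    return reduce_phase(pairs)


# Hypothetical sample documents standing in for Wikipedia articles.
docs = {
    "doc1": "Hadoop stores big data",
    "doc2": "Big data needs an inverted index",
}
index = build_index(docs)

# Look up a word: each entry is (document id, word position).
print(index.get("big", []))
```

A lookup is then a single dictionary access per query word. In an actual Hadoop job the map and reduce steps would run on separate nodes with a shuffle between them, which this sketch leaves out.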
Pages: 156-161
Page count: 6