Processing RDF Using Hadoop

Cited by: 0
Authors
Ali, Mehreen [1]
Bharat, K. Sriram [1]
Ranichandra, C. [1]
Affiliations
[1] VIT Univ, Vellore, Tamil Nadu, India
Keywords
Semantic Web; Distributed Computing; Map-Reduce Programming; SPARQL; Graph Data; Performance Evaluation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The basic inspiration of the Semantic Web is to broaden the existing human-readable web by encoding some of the semantics of resources in a machine-understandable form. Various formats and technologies help make this possible. They comprise the Resource Description Framework (RDF), an assortment of data interchange formats such as RDF/XML, N3, and N-Triples, and representations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which help describe the concepts, terms, and associations in a particular knowledge domain. Frameworks for semantic web technologies already exist, but they have limitations when handling large RDF graphs. Storing and efficiently querying a large number of RDF triples is therefore a challenging and important problem. We propose a framework, built on Hadoop, that stores and retrieves massive numbers of RDF triples by taking advantage of the cloud computing paradigm. Hadoop permits the development of reliable, scalable, efficient, cost-effective, distributed computing using very simple Java interfaces. Hadoop includes a distributed file system, HDFS, which is used to store the RDF data. The Hadoop MapReduce framework is used to answer queries. A MapReduce job divides the input data set into independent units that are processed in parallel by the map tasks; the outputs of the map tasks then serve as inputs to the reduce tasks. The framework takes care of scheduling tasks, monitoring them, and re-executing failed tasks. The distinguishing feature of our approach is its efficient, automatic allocation of data and work across machines, which in turn exploits the underlying parallelism of the CPU cores. Results confirm that, in contrast to various traditional approaches, the proposed framework offers multiple benefits, including on-demand processing, operational scalability, capability, cost efficiency, and local access to very large data sets.
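The map and reduce flow described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation and does not use the Hadoop API; it only simulates the phases in plain Python. The sample triples, the `http://ex.org/` URIs, the two-pattern query (find the names of all students), and every function name are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample data in N-Triples form (not taken from the paper).
TRIPLES = [
    '<http://ex.org/alice> <http://ex.org/type> <http://ex.org/Student> .',
    '<http://ex.org/alice> <http://ex.org/name> "Alice" .',
    '<http://ex.org/bob> <http://ex.org/type> <http://ex.org/Professor> .',
    '<http://ex.org/bob> <http://ex.org/name> "Bob" .',
    '<http://ex.org/carol> <http://ex.org/type> <http://ex.org/Student> .',
    '<http://ex.org/carol> <http://ex.org/name> "Carol" .',
]

TYPE = '<http://ex.org/type>'
NAME = '<http://ex.org/name>'
STUDENT = '<http://ex.org/Student>'


def parse_triple(line):
    """Split a simple N-Triples line into (subject, predicate, object)."""
    subject, predicate, obj = line.rstrip(' .').split(' ', 2)
    return subject, predicate, obj


def map_task(split):
    """Emit (subject, (tag, value)) for triples matching either query pattern."""
    for line in split:
        s, p, o = parse_triple(line)
        if p == TYPE and o == STUDENT:
            yield s, ('type', o)
        elif p == NAME:
            yield s, ('name', o)


def shuffle(pairs):
    """Group mapped values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(subject, values):
    """Join the two patterns on the subject: keep names of students only."""
    if any(tag == 'type' for tag, _ in values):
        for tag, value in values:
            if tag == 'name':
                yield subject, value


def run_query(triples, num_splits=2):
    """Run map, shuffle, and reduce sequentially over independent input splits."""
    splits = [triples[i::num_splits] for i in range(num_splits)]
    mapped = [pair for split in splits for pair in map_task(split)]
    results = []
    for subject, values in sorted(shuffle(mapped).items()):
        results.extend(reduce_task(subject, values))
    return results


if __name__ == '__main__':
    for subject, name in run_query(TRIPLES):
        print(subject, name)
```

In a real Hadoop deployment the splits would be HDFS blocks and the map and reduce tasks would run on different machines; here they run one after another so the data flow is easy to follow.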
Pages: 385-394
Page count: 10
Related papers
50 items in total
  • [31] Hadoop Image Processing Framework
    Vemula, Sridhar
    Crick, Christopher
    [J]. 2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015, : 506 - 513
  • [32] Statistical analysis of multi job processing in Hadoop environment using schedulers
    Prasad, M. S. Guru
    Singh, Prabhdeep
    Taneja, Harsh
    Jain, Amith K.
    Chandrappa, S.
    [J]. JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2022, 43 (03): : 497 - 504
  • [33] Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data
    Hussein, Eslam
    Sadiki, Ronewa
    Jafta, Yahlieel
    Sungay, Muhammad Mujahid
    Ajayi, Olasupo
    Bagula, Antoine
    [J]. E-INFRASTRUCTURE AND E-SERVICES FOR DEVELOPING COUNTRIES (AFRICOMM 2019), 2020, 311 : 180 - 185
  • [34] NEAR REAL-TIME PROCESSING OF PROTEOMICS DATA USING HADOOP
    Hillman, Chris
    Ahmad, Yasmeen
    Whitehorn, Mark
    Cobley, Andy
    [J]. BIG DATA, 2014, 2 (01) : 44 - 49
  • [35] Query Rewriting in RDF Stream Processing
    Calbimonte, Jean-Paul
    Mora, Jose
    Corcho, Oscar
    [J]. SEMANTIC WEB: LATEST ADVANCES AND NEW DOMAINS, 2016, 9678 : 486 - 502
  • [36] Parallel Approach in RDF Query Processing
    Vajgl, Marek
    Parenica, Jan
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NUMERICAL ANALYSIS AND APPLIED MATHEMATICS 2016 (ICNAAM-2016), 2017, 1863
  • [37] Reactive Processing of RDF Streams of Events
    Calbimonte, Jean-Paul
    Aberer, Karl
    [J]. SEMANTIC WEB: ESWC 2015 SATELLITE EVENTS, 2015, 9341 : 457 - 468
  • [38] An Adaptive Framework for RDF Stream Processing
    Li, Qiong
    Zhang, Xiaowang
    Feng, Zhiyong
    [J]. WEB AND BIG DATA, APWEB-WAIM 2017, PT I, 2017, 10366 : 427 - 443
  • [39] Distributed processing using cosine similarity for mapping Big Data in Hadoop
    Rojas, A. F.
    Gelvez, N. Y.
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2016, 14 (06) : 2857 - 2861
  • [40] A Hadoop-based application platform for large-scale RDF semantic data
    肖宝
    李璞
    胡文君
    韦丽娜
    [J]. Journal of Beibu Gulf University, 2017, 32 (01) : 12 - 17