Linked Data Partitioning for RDF Processing on Apache Spark

Cited by: 0

Authors:

Atashkar, Amir Hossein ^{[1]}

Ghadiri, Nasser ^{[1]}

Joodaki, Mehdi ^{[1]}

Affiliations:

[1] Isfahan Univ Technol, Dept Elect & Comp Engn, Esfahan, Iran

Source:

2017 3RD INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR) | 2017

Keywords:

Linked data; scalable algorithms; NoSQL; big data;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

RDF models are widely used in the web of data due to their flexibility and similarity to graph patterns. As the use of RDF grows, the volume and content of RDF data are increasing. Processing such massive amounts of data on a single machine is therefore not efficient enough, because of response time and limited hardware resources. A common approach to overcome this limitation is cluster processing, and huge datasets can benefit from distributed cluster processing on Apache Hadoop. However, because Hadoop relies heavily on hard disks, the processing time is usually inadequate. In this paper, we propose a partitioning approach based on Apache Spark for rapid processing of RDF data models. A key feature of Apache Spark is that it uses main memory instead of hard disk, so the speed of data processing in our method is improved. We have evaluated the proposed method by running SQL queries on RDF data partitioned across the cluster, and the results demonstrate improved performance.

Pages: 73-77

Page count: 5

