Query Answering On Uncertain Big RDF Data Using Apache Spark Framework

Cited by: 0

Authors:

Benbernou, Salima [1]

Ouziri, Mourad [1]

Affiliations:

[1] Univ Paris 05, Univ Sorbonnes Paris Cite, Paris, France

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2018

Keywords:

Big data; Uncertainty; Probabilistic RDF; Query answering; Apache Spark ecosystem;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Data is often associated with uncertainty because of the fusion of conflicting data sources, measurement inaccuracy, sampling discrepancy, and outdated data sources. We address the problem of query answering over uncertain big data sources using the Resource Description Framework (RDF) and ontologies, while computing the exact uncertainty measure of the answer. Probability is therefore carried through the reasoning process when answering the query. In this paper, we introduce a probabilistic approach for answering user queries that computes complete results by exploiting uncertain knowledge about data sources. We have designed algorithms based on ontological rules that infer implicit data by combining saturation and query rewriting. To handle big data, the algorithms are implemented on Apache Spark.


Pages: 4854-4860

Page count: 7
