Query Answering On Uncertain Big RDF Data Using Apache Spark Framework

Cited by: 0
Authors
Benbernou, Salima [1]
Ouziri, Mourad [1]
Affiliations
[1] Univ Paris 05, Univ Sorbonnes Paris Cite, Paris, France
Keywords
Big data; Uncertainty; Probabilistic RDF; Query answering; Apache Spark ecosystem;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data is often associated with uncertainty because of the fusion of conflicting data sources, measurement inaccuracy, sampling discrepancy, and outdated data sources. We address the problem of query answering over uncertain big data sources using the Resource Description Framework (RDF) and ontologies, while computing the exact uncertainty measure of the answer. The probability is therefore carried along the reasoning process when answering the query. In this paper, we introduce a probabilistic approach for answering user queries that computes complete results by exploiting uncertain knowledge on data sources. We have designed algorithms based on ontological rules that infer implicit data by combining saturation and query rewriting reasoning. To handle big data, the algorithms are implemented on Apache Spark.
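The abstract gives no algorithmic detail, so the following is only a minimal sketch of what saturation over probabilistic RDF triples could look like, not the authors' algorithm. It rests on several assumptions that are not in the abstract: a single certain rule type (rdfs:subClassOf), base triples that are mutually independent events, plain Python in place of Spark, and hypothetical names (`superclasses`, `saturate`, `ask`). Under those assumptions an inferred triple holds exactly when at least one supporting base triple holds, so its probability is one minus the product of the complements.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def superclasses(cls, sub_class_of):
    """Return cls plus every class reachable through subClassOf edges."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def saturate(triples, sub_class_of):
    """Saturate probabilistic triples with the (certain) subClassOf rule.

    triples: dict mapping (subject, predicate, object) -> probability.
    Assumption: base triples are independent events, so a derived triple
    has probability 1 - prod(1 - p_i) over its supporting base triples.
    """
    # Probability that none of the supporting base triples holds.
    none_holds = defaultdict(lambda: 1.0)
    for (s, p, o), prob in triples.items():
        targets = superclasses(o, sub_class_of) if p == RDF_TYPE else {o}
        for target in targets:
            none_holds[(s, p, target)] *= 1.0 - prob
    return {triple: 1.0 - q for triple, q in none_holds.items()}


def ask(saturated, pattern):
    """Match a triple pattern; None acts as a wildcard in any position."""
    return {
        triple: prob
        for triple, prob in saturated.items()
        if all(want is None or want == got for want, got in zip(pattern, triple))
    }


# Illustrative data only; not taken from the paper.
base = {
    ("alice", RDF_TYPE, "Student"): 0.8,
    ("alice", RDF_TYPE, "Employee"): 0.5,
}
ontology = {"Student": ["Person"], "Employee": ["Person"]}

saturated = saturate(base, ontology)
answers = ask(saturated, (None, RDF_TYPE, "Person"))
```

Here `answers` contains the implicit triple (alice, rdf:type, Person), which appears in neither base triple, with probability 1 - (1 - 0.8)(1 - 0.5) = 0.9. In a Spark setting the loop in `saturate` would map naturally onto a flatMap over triples followed by a reduce by key, but that mapping is likewise an assumption rather than something the abstract states.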
Pages: 4854-4860
Number of pages: 7
Related papers
50 items in total
  • [31] Efficient indexing RDF query algorithm for big data
    Zeng, Yiqun
    Wang, Jingbin
    [J]. MACHINERY ELECTRONICS AND CONTROL ENGINEERING III, 2014, 441 : 691 - 694
  • [32] A hierarchical indexing strategy for optimizing Apache Spark with HDFS to efficiently query big geospatial raster data
    Hu, Fei
    Yang, Chaowei
    Jiang, Yongyao
    Li, Yun
    Song, Weiwei
    Duffy, Daniel Q.
    Schnase, John L.
    Lee, Tsengdar
    [J]. INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2020, 13 (03) : 410 - 428
  • [33] Completeness Statements about RDF Data Sources and Their Use for Query Answering
    Darari, Fariz
    Nutt, Werner
    Pirro, Giuseppe
    Razniewski, Simon
    [J]. SEMANTIC WEB - ISWC 2013, PART I, 2013, 8218 : 66 - 83
  • [34] Big Data in metagenomics: Apache Spark vs MPI
    Abuin, Jose M.
    Lopes, Nuno
    Ferreira, Luis
    Pena, Tomas F.
    Schmidt, Bertil
    [J]. PLOS ONE, 2020, 15 (10):
  • [35] Static and Dynamic Big Data Partitioning on Apache Spark
    Bertolucci, Massimiliano
    Carlini, Emanuele
    Dazzi, Patrizio
    Lulli, Alessandro
    Ricci, Laura
    [J]. PARALLEL COMPUTING: ON THE ROAD TO EXASCALE, 2016, 27 : 489 - 498
  • [36] Scalable Manifold Learning for Big Data with Apache Spark
    Schoeneman, Frank
    Zola, Jaroslaw
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 272 - 281
  • [37] Accelerating Apache Spark Big Data Analysis with FPGAs
    Ghasemi, Ehsan
    Chow, Paul
    [J]. 2016 INT IEEE CONFERENCES ON UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING AND COMMUNICATIONS, CLOUD AND BIG DATA COMPUTING, INTERNET OF PEOPLE, AND SMART WORLD CONGRESS (UIC/ATC/SCALCOM/CBDCOM/IOP/SMARTWORLD), 2016, : 737 - 744
  • [38] Accelerating Apache Spark Big Data Analysis with FPGAs
    Ghasemi, Ehsan
    Chow, Paul
    [J]. 2016 IEEE 24TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2016, : 94 - 94
  • [39] A Big Data Analysis Platform for Healthcare on Apache Spark
    Zhang, Jinwei
    Zhang, Yong
    Hu, Qingcheng
    Tian, Hongliang
    Xing, Chunxiao
    [J]. SMART HEALTH, ICSH 2016, 2017, 10219 : 32 - 43
  • [40] Apache Spark: A Unified Engine for Big Data Processing
    Zaharia, Matei
    Xin, Reynold S.
    Wendell, Patrick
    Das, Tathagata
    Armbrust, Michael
    Dave, Ankur
    Meng, Xiangrui
    Rosen, Josh
    Venkataraman, Shivaram
    Franklin, Michael J.
    Ghodsi, Ali
    Gonzalez, Joseph
    Shenker, Scott
    Stoica, Ion
    [J]. COMMUNICATIONS OF THE ACM, 2016, 59 (11) : 56 - 65