On Scale Independence for Querying Big Data

Authors
Fan, Wenfei [1, 2]
Geerts, Floris [3]
Libkin, Leonid [4]
Affiliations
[1] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[2] Beihang Univ, RCDB & SKLSDE Lab, Beijing, Peoples R China
[3] Univ Antwerp, Dept Math & Comp Sci, Antwerp, Belgium
[4] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
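Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Scale independence; big data; query answering
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract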
To make query answering feasible in big datasets, practitioners have been looking into the notion of scale independence of queries. Intuitively, such queries require only a relatively small subset of the data, whose size is determined by the query and access methods rather than the size of the dataset itself. This paper aims to formalize this notion and study its properties. We start by defining what it means to be scale-independent, and provide matching upper and lower bounds for checking scale independence, for queries in various languages, and for combined and data complexity. Since the complexity turns out to be rather high, and since scale-independent queries cannot be captured syntactically, we develop sufficient conditions for scale independence. We formulate them based on access schemas, which combine indexing and constraints together with bounds on the sizes of retrieved data sets. We then study two variations of scale-independent query answering, inspired by existing practical systems. One concerns incremental query answering: we check when query answers can be maintained in response to updates scale-independently. The other explores scale-independent query rewriting using views.
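To make the intuition in the abstract concrete, the following is a minimal sketch of answering a query through indexed lookups with declared size bounds, in the spirit of an access schema. Everything in it is an assumption made for illustration: the `IndexedRelation` class, the `friend` and `person` relations, the bound of three friends per person, and the `build` helper are hypothetical and are not taken from the paper. The point it tries to show is that the number of tuples fetched depends on the query and the bounds, not on how many rows the dataset holds.

```python
from collections import defaultdict


class IndexedRelation:
    """Hypothetical relation with one hash index and a declared fetch bound.

    The bound plays the role of a cardinality constraint: a lookup on `key`
    returns at most `bound` rows.
    """

    def __init__(self, rows, key, bound):
        self.key = key
        self.bound = bound
        self.fetched = 0  # running count of rows retrieved through the index
        self.index = defaultdict(list)
        for row in rows:
            self.index[row[key]].append(row)
        # Reject data that violates the declared bound.
        for value, bucket in self.index.items():
            if len(bucket) > bound:
                raise ValueError(f"{len(bucket)} rows for {key}={value!r} exceed bound {bound}")

    def fetch(self, value):
        """Return the rows whose key equals `value`, counting them as accessed."""
        rows = self.index.get(value, [])
        self.fetched += len(rows)
        return rows


def friends_in_city(friend, person, pid, city):
    """Names of the friends of `pid` who live in `city`, using index lookups only."""
    names = []
    for f in friend.fetch(pid):                 # at most friend.bound rows
        for p in person.fetch(f["friend_id"]):  # at most person.bound rows each
            if p["city"] == city:
                names.append(p["name"])
    return sorted(names)


def build(n_people, max_friends=3):
    """Build a toy dataset of `n_people` persons, each with `max_friends` friends."""
    person_rows = [
        {"id": i, "name": f"p{i}", "city": "NYC" if i % 2 == 0 else "LA"}
        for i in range(n_people)
    ]
    friend_rows = [
        {"id": i, "friend_id": (i + k) % n_people}
        for i in range(n_people)
        for k in range(1, max_friends + 1)
    ]
    friend = IndexedRelation(friend_rows, key="id", bound=max_friends)
    person = IndexedRelation(person_rows, key="id", bound=1)
    return friend, person


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        friend, person = build(n)
        answer = friends_in_city(friend, person, 0, "NYC")
        print(n, answer, friend.fetched + person.fetched)
```

In this toy setting the query touches at most `max_friends * (1 + 1)` rows regardless of `n_people`, which is the kind of bound the abstract describes as being determined by the query and the access methods rather than by the size of the dataset.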
Pages: 51-62
Page count: 12