Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data

Citations: 0
Authors
Feng, Su [1]
Glavic, Boris [1]
Kennedy, Oliver [2]
Affiliations
[1] Illinois Inst Technol, Chicago, IL 60616 USA
[2] SUNY Buffalo, Buffalo, NY USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023 / Vol. 16 / No. 06
Keywords
DATABASES; AGGREGATION; INFORMATION;
DOI
10.14778/3583140.3583151
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Uncertainty arises naturally in many application domains due to, e.g., data entry errors and ambiguity in data cleaning. Prior work in incomplete and probabilistic databases has investigated the semantics and efficient evaluation of ranking and top-k queries over uncertain data. However, most approaches deal with top-k and ranking in isolation and represent uncertain input data and query results using separate, incompatible data models. We present an efficient approach for under- and over-approximating results of ranking, top-k, and window queries over uncertain data. Our approach integrates well with existing techniques for querying uncertain data, is efficient, and is, to the best of our knowledge, the first to support windowed aggregation. We design algorithms for physical operators for uncertain sorting and windowed aggregation, and implement them in PostgreSQL. We evaluated our approach on synthetic and real-world datasets, demonstrating that it outperforms all competitors, and often produces more accurate results.
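The idea of under- and over-approximating a top-k result can be illustrated with a minimal sketch. It is not the authors' algorithm or their PostgreSQL operators; it assumes, purely for illustration, that each tuple's sort attribute is known only as a closed interval `(lo, hi)` and that the sort is ascending. The function names `position_bounds` and `top_k_bounds` are hypothetical. A tuple is certainly in the top-k if at most k-1 other tuples could possibly precede it, and possibly in the top-k if fewer than k other tuples certainly precede it.

```python
from typing import List, Tuple

# Assumption: an uncertain sort value is modeled as a closed interval (lo, hi).
Interval = Tuple[float, float]


def position_bounds(values: List[Interval]) -> List[Tuple[int, int]]:
    """Return (min_pos, max_pos) of each tuple in an ascending sort (0-based)."""
    bounds = []
    for i, (lo_i, hi_i) in enumerate(values):
        # Tuples whose largest possible value is still below our smallest one
        # precede us in every possible world.
        certainly_before = sum(
            1 for j, (_, hi_j) in enumerate(values) if j != i and hi_j < lo_i
        )
        # Tuples whose smallest possible value does not exceed our largest one
        # may precede us in some possible world (ties are counted as possible).
        possibly_before = sum(
            1 for j, (lo_j, _) in enumerate(values) if j != i and lo_j <= hi_i
        )
        bounds.append((certainly_before, possibly_before))
    return bounds


def top_k_bounds(values: List[Interval], k: int) -> Tuple[List[int], List[int]]:
    """Return (certain, possible): index lists under- and over-approximating top-k."""
    bounds = position_bounds(values)
    certain = [i for i, (_, max_pos) in enumerate(bounds) if max_pos < k]
    possible = [i for i, (min_pos, _) in enumerate(bounds) if min_pos < k]
    return certain, possible


if __name__ == "__main__":
    data = [(1, 2), (3, 5), (4, 6), (7, 8)]
    print(top_k_bounds(data, 2))
```

With the sample intervals above and k = 2, only the first tuple is certainly in the result, while the first three are possibly in it, so the certain set is contained in the possible set. This quadratic pairwise comparison is chosen for clarity only and says nothing about the efficiency of the approach described in the abstract.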
Pages: 1346 - 1358
Number of pages: 13
Related papers
50 in total
  • [31] Querying Incomplete Numerical Data: Between Certain and Possible Answers
    Console, Marco
    Libkin, Leonid
    Peterfreund, Liat
    [J]. PROCEEDINGS OF THE 42ND ACM SIGMOD-SIGACT-SIGAI SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, PODS 2023, 2023, : 349 - 358
  • [32] EFFICIENT SECONDARY MEMORY PROCESSING OF WINDOW QUERIES ON SPATIAL DATA
    NARDELLI, E
    PROIETTI, G
    [J]. INFORMATION SCIENCES, 1995, 84 (1-2) : 67 - 83
  • [33] On concurrency control in sliding window queries over data streams
    Golab, Lukasz
    Bijay, Kumar Gaurav
    Ozsu, M. Tamer
    [J]. ADVANCES IN DATABASE TECHNOLOGY - EDBT 2006, 2006, 3896 : 608 - 626
  • [34] Load shedding for window queries over continuous data streams
    Kim, Kwang Rak
    Kim, Hyeon Gyu
    [J]. Lecture Notes in Electrical Engineering, 2015, 373 : 159 - 164
  • [35] Trustworthy answers for top-k queries on uncertain Big Data in decision making
    Nguyen, H. T. H.
    Cao, J.
    [J]. INFORMATION SCIENCES, 2015, 318 : 73 - 90
  • [36] SPHLU: An Efficient Algorithm for Processing PRkNN Queries on Uncertain Data
    WANG Shengsheng
    LI Yang
    CHAI Sheng
    BOLOU Bolou Dickson
    [J]. Chinese Journal of Electronics, 2016, 25 (03) : 403 - 406
  • [37] Debugging Missing Answers for Spark Queries over Nested Data with Breadcrumb
    Diestelkaemper, Ralf
    Lee, Seokki
    Glavic, Boris
    Herschel, Melanie
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (12) : 2731 - 2734
  • [38] Sliding-Window Probabilistic Threshold Aggregate Queries on Uncertain Data Streams
    Chen, Donghui
    Chen, Ling
    [J]. INFORMATION SCIENCES, 2020, 520 (520) : 353 - 372
  • [39] SPHLU: An Efficient Algorithm for Processing PRkNN Queries on Uncertain Data
    Wang Shengsheng
    Li Yang
    Chai Sheng
    Bolou, Bolou Dickson
    [J]. CHINESE JOURNAL OF ELECTRONICS, 2016, 25 (03) : 403 - 406
  • [40] Adaptive Processing for Distributed Skyline Queries over Uncertain Data
    Zhou, Xu
    Li, Kenli
    Zhou, Yantao
    Li, Keqin
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (02) : 371 - 384