Indexing Metric Uncertain Data for Range Queries

Cited by: 8
Authors
Chen, Lu [1]
Gao, Yunjun [1, 2]
Li, Xinhan [1]
Jensen, Christian S. [3]
Chen, Gang [1]
Zheng, Baihua [4]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou, Zhejiang, Peoples R China
[2] Zhejiang Univ, Innovat Joint Res Ctr Cyber Phys Soc Syst, Hangzhou, Zhejiang, Peoples R China
[3] Aalborg Univ, Dept Comp Sci, Aalborg, Denmark
[4] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
Keywords
Range query; Uncertain data; Metric space; Index structure; Nearest-neighbor search
DOI
10.1145/2723372.2723728
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Range queries in metric spaces have applications in many areas, such as multimedia retrieval, computational biology, and location-based services, where metric uncertain data exists in different forms as a result of equipment limitations, high-throughput sequencing technologies, privacy preservation, and other causes. In this paper, we represent metric uncertain data using an object-level model and a bi-level model, respectively. Two novel indexes, the uncertain pivot B+-tree (UPB-tree) and the uncertain pivot B+-forest (UPB-forest), are proposed accordingly to support probabilistic range queries over a wide range of uncertain data types and similarity metrics. Both index structures use a small set of effective pivots chosen according to a newly defined criterion, and employ B+-tree(s) as the underlying index. By design, they are easy to integrate into any existing DBMS. In addition, we present efficient metric probabilistic range query algorithms that use validation and pruning techniques based on our derived lower and upper probability bounds. Extensive experiments with both real and synthetic data sets demonstrate that, compared with existing state-of-the-art indexes for metric uncertain data, the UPB-tree and UPB-forest incur much lower construction costs, consume less storage space, and support more efficient metric probabilistic range queries.
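The validation and pruning idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's UPB-tree algorithm or its derived bounds; it only shows the generic pivot-based principle, assuming an uncertain object is a list of (instance, probability) pairs and using the triangle inequality to bound the distance between a query and each instance. All names here (`probability_bounds`, `range_query_decision`, the threshold `tau`, the Euclidean metric) are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
# Assumption: an uncertain object is a discrete set of instances whose
# probabilities sum to 1.
UncertainObject = List[Tuple[Point, float]]


def euclidean(a: Point, b: Point) -> float:
    """Example metric; any distance satisfying the triangle inequality works."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def probability_bounds(
    q: Point,
    obj: UncertainObject,
    pivots: Sequence[Point],
    r: float,
    dist: Callable[[Point, Point], float] = euclidean,
) -> Tuple[float, float]:
    """Bound P(d(q, obj) <= r) using only distances to pivots."""
    q_to_pivots = [dist(q, p) for p in pivots]
    surely_inside = 0.0   # probability mass proven to lie within r
    surely_outside = 0.0  # probability mass proven to lie beyond r
    for inst, prob in obj:
        # In a real index these pivot distances would be precomputed and stored.
        inst_to_pivots = [dist(inst, p) for p in pivots]
        pairs = list(zip(q_to_pivots, inst_to_pivots))
        lb = max(abs(dq - di) for dq, di in pairs)  # d(q, inst) >= |dq - di|
        ub = min(dq + di for dq, di in pairs)       # d(q, inst) <= dq + di
        if ub <= r:
            surely_inside += prob
        elif lb > r:
            surely_outside += prob
    return surely_inside, 1.0 - surely_outside


def range_query_decision(
    q: Point,
    obj: UncertainObject,
    pivots: Sequence[Point],
    r: float,
    tau: float,
    dist: Callable[[Point, Point], float] = euclidean,
) -> str:
    """Classify an object for a probabilistic range query with threshold tau."""
    lower, upper = probability_bounds(q, obj, pivots, r, dist)
    if upper < tau:
        return "pruned"      # cannot reach the threshold
    if lower >= tau:
        return "validated"   # qualifies without exact distance computations
    # Bounds are inconclusive: fall back to exact evaluation.
    exact = sum(prob for inst, prob in obj if dist(q, inst) <= r)
    return "result" if exact >= tau else "rejected"
```

In this sketch, an object is discarded or accepted from pivot distances alone whenever the bounds are decisive, and exact distances are computed only for the remaining candidates.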
Pages: 951-965
Page count: 15
Related papers
50 in total
  • [1] Indexing metric uncertain data for range queries and range joins
    Lu Chen
    Yunjun Gao
    Aoxiao Zhong
    Christian S. Jensen
    Gang Chen
    Baihua Zheng
    The VLDB Journal, 2017, 26 : 585 - 610
  • [2] Indexing metric uncertain data for range queries and range joins
    Chen, Lu
    Gao, Yunjun
    Zhong, Aoxiao
    Jensen, Christian S.
    Chen, Gang
    Zheng, Baihua
    VLDB JOURNAL, 2017, 26 (04): : 585 - 610
  • [3] Indexing Uncertain Data for Supporting Range Queries
    Zhu, Rui
    Wang, Bin
    Wang, Guoren
    WEB-AGE INFORMATION MANAGEMENT, WAIM 2014, 2014, 8485 : 72 - 83
  • [4] Range queries on uncertain data
    Li, Jian
    Wang, Haitao
    THEORETICAL COMPUTER SCIENCE, 2016, 609 : 32 - 48
  • [5] Range Queries on Uncertain Data
    Li, Jian
    Wang, Haitao
    ALGORITHMS AND COMPUTATION, ISAAC 2014, 2014, 8889 : 326 - 337
  • [6] Indexing Uncertain Data in General Metric Spaces
    Angiulli, Fabrizio
    Fassetti, Fabio
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (09) : 1640 - 1657
  • [7] Range-max queries on uncertain data
    Agarwal, Pankaj K.
    Kumar, Nirman
    Sintos, Stavros
    Suri, Subhash
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2018, 94 : 118 - 134
  • [8] Uncertain probabilistic range queries on multidimensional data
    Bernad, Jorge
    Bobed, Carlos
    Mena, Eduardo
    INFORMATION SCIENCES, 2020, 537 : 334 - 367
  • [9] Range-Max Queries on Uncertain Data
    Agarwal, Pankaj K.
    Kumar, Nirman
    Sintos, Stavros
    Suri, Subhash
    PODS'16: PROCEEDINGS OF THE 35TH ACM SIGMOD-SIGACT-SIGAI SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, 2016, : 465 - 476
  • [10] Indexing Uncertain Data
    Agarwal, Pankaj K.
    Cheng, Siu-Wing
    Tao, Yufei
    Yi, Ke
    PODS'09: PROCEEDINGS OF THE TWENTY-EIGHTH ACM SIGMOD-SIGACT-SIGART SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, 2009, : 137 - 146