Indexing Metric Uncertain Data for Range Queries

Cited by: 8
Authors
Chen, Lu [1]
Gao, Yunjun [1, 2]
Li, Xinhan [1]
Jensen, Christian S. [3]
Chen, Gang [1]
Zheng, Baihua [4]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou, Zhejiang, Peoples R China
[2] Zhejiang Univ, Innovat Joint Res Ctr Cyber Phys Soc Syst, Hangzhou, Zhejiang, Peoples R China
[3] Aalborg Univ, Dept Comp Sci, Aalborg, Denmark
[4] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
Keywords
Range query; Uncertain data; Metric space; Index structure; Nearest-neighbor search
DOI
10.1145/2723372.2723728
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Range queries in metric spaces have applications in many areas, such as multimedia retrieval, computational biology, and location-based services, where metric uncertain data exists in different forms as a result of equipment limitations, high-throughput sequencing technologies, privacy preservation, or other causes. In this paper, we represent metric uncertain data using an object-level model and a bi-level model, respectively. Two novel indexes, the uncertain pivot B+-tree (UPB-tree) and the uncertain pivot B+-forest (UPB-forest), are proposed accordingly to support probabilistic range queries over a wide range of uncertain data types and similarity metrics. Both index structures use a small set of effective pivots chosen according to a newly defined criterion, and employ B+-tree(s) as the underlying index. By design, they are easy to integrate into any existing DBMS. In addition, we present efficient metric probabilistic range query algorithms, which use validation and pruning techniques based on our derived lower and upper probability bounds. Extensive experiments with both real and synthetic data sets demonstrate that, compared with existing state-of-the-art indexes for metric uncertain data, the UPB-tree and UPB-forest incur much lower construction costs, consume less storage space, and support more efficient metric probabilistic range queries.
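The validation and pruning idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the UPB-tree or UPB-forest: it omits the B+-tree and the pivot selection criterion, and it assumes an object-level model in which each uncertain object is a finite list of (instance, probability) pairs. The function names `probability_bounds` and `prob_range_query`, the use of Euclidean distance, and the parameter names `r` (query radius) and `tau` (probability threshold) are all assumptions made for illustration. The bounds follow only from the triangle inequality, |d(x,p) - d(q,p)| <= d(x,q) <= d(x,p) + d(q,p), for a pivot p.

```python
from math import dist  # Euclidean metric, used here only as an example metric


def probability_bounds(pivot_dists, probs, query_pivot_dists, r):
    """Bound P(d(object, q) <= r) using only distances to pivots.

    pivot_dists[i][j] is d(instance i, pivot j); probs[i] is the
    probability of instance i; query_pivot_dists[j] is d(q, pivot j).
    """
    lower = upper = 0.0
    for dists, p in zip(pivot_dists, probs):
        # Tightest triangle-inequality bounds on d(instance, q) over all pivots.
        lo = max(abs(d - dq) for d, dq in zip(dists, query_pivot_dists))
        hi = min(d + dq for d, dq in zip(dists, query_pivot_dists))
        if hi <= r:   # instance is certainly inside the query range
            lower += p
        if lo <= r:   # instance cannot be ruled out
            upper += p
    return lower, upper


def prob_range_query(objects, pivots, q, r, tau, metric=dist):
    """Return (answers, refined): objects with P(d(o, q) <= r) >= tau,
    and the subset that needed exact distance computations."""
    query_pivot_dists = [metric(q, pv) for pv in pivots]
    answers, refined = [], []
    for name, instances in objects.items():
        probs = [p for _, p in instances]
        # Assumption: a real index would precompute and store these distances.
        pivot_dists = [[metric(pt, pv) for pv in pivots] for pt, _ in instances]
        lower, upper = probability_bounds(pivot_dists, probs, query_pivot_dists, r)
        if lower >= tau:      # validated without any distance to q
            answers.append(name)
        elif upper < tau:     # pruned without any distance to q
            continue
        else:                 # undecided: fall back to exact computation
            refined.append(name)
            exact = sum(p for pt, p in instances if metric(pt, q) <= r)
            if exact >= tau:
                answers.append(name)
    return answers, refined
```

In this sketch an object is accepted as soon as its lower bound reaches `tau` and discarded as soon as its upper bound falls below `tau`, so distances between the query and the object's instances are computed only for the undecided objects.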
Pages: 951-965
Page count: 15