Indexing Metric Uncertain Data for Range Queries

Cited by: 8
Authors
Chen, Lu [1]
Gao, Yunjun [1, 2]
Li, Xinhan [1]
Jensen, Christian S. [3]
Chen, Gang [1]
Zheng, Baihua [4]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou, Zhejiang, Peoples R China
[2] Zhejiang Univ, Innovat Joint Res Ctr Cyber Phys Soc Syst, Hangzhou, Zhejiang, Peoples R China
[3] Aalborg Univ, Dept Comp Sci, Aalborg, Denmark
[4] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
Keywords
Range query; Uncertain data; Metric space; Index structure; Nearest-neighbor search
DOI
10.1145/2723372.2723728
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Range queries in metric spaces have applications in many areas such as multimedia retrieval, computational biology, and location-based services, where metric uncertain data exists in different forms, resulting from equipment limitations, high-throughput sequencing technologies, privacy preservation, or other causes. In this paper, we represent metric uncertain data by using an object-level model and a bi-level model, respectively. Two novel indexes, the uncertain pivot B+-tree (UPB-tree) and the uncertain pivot B+-forest (UPB-forest), are proposed accordingly to support probabilistic range queries w.r.t. a wide range of uncertain data types and similarity metrics. Both index structures use a small set of effective pivots chosen based on a newly defined criterion, and employ B+-tree(s) as the underlying index. By design, they are easy to integrate into any existing DBMS. In addition, we present efficient metric probabilistic range query algorithms, which use validation and pruning techniques based on our derived probability lower and upper bounds. Extensive experiments with both real and synthetic data sets demonstrate that, compared against existing state-of-the-art indexes for metric uncertain data, the UPB-tree and UPB-forest incur much lower construction costs, consume less storage space, and support more efficient metric probabilistic range queries.
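The validation and pruning idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's UPB-tree or its derived bounds; it assumes a simple object-level model in which each uncertain object is a list of (instance, probability) pairs, and it uses only the triangle inequality on precomputed pivot distances. All function names here are hypothetical, and Euclidean distance merely stands in for an arbitrary metric.

```python
from math import dist  # Euclidean distance stands in for any metric


def pivot_table(obj, pivots, metric=dist):
    """Instance-to-pivot distances for one uncertain object.

    obj is a list of (instance, probability) pairs whose probabilities sum to 1.
    A real index would precompute and store these; here they are built on demand.
    """
    return [([metric(x, p) for p in pivots], prob) for x, prob in obj]


def probability_bounds(q_dists, table, radius):
    """Bound Pr[d(q, o) <= radius] using only pivot distances."""
    lower, excluded = 0.0, 0.0
    for inst_dists, prob in table:
        pairs = list(zip(q_dists, inst_dists))
        if any(dq + di <= radius for dq, di in pairs):
            lower += prob      # d(q, x) <= d(q, p) + d(x, p) <= radius
        elif any(abs(dq - di) > radius for dq, di in pairs):
            excluded += prob   # d(q, x) >= |d(q, p) - d(x, p)| > radius
    return lower, 1.0 - excluded


def probabilistic_range_query(objects, pivots, q, radius, theta, metric=dist):
    """Return names of objects o with Pr[d(q, o) <= radius] >= theta."""
    q_dists = [metric(q, p) for p in pivots]
    result = []
    for name, obj in objects.items():
        table = pivot_table(obj, pivots, metric)
        lower, upper = probability_bounds(q_dists, table, radius)
        if upper < theta:
            continue               # pruned: cannot reach the threshold
        if lower >= theta:
            result.append(name)    # validated: no exact distances needed
            continue
        # Undecided: fall back to exact distance computations.
        exact = sum(prob for x, prob in obj if metric(q, x) <= radius)
        if exact >= theta:
            result.append(name)
    return result
```

In this sketch, an object is resolved without any query-to-instance distance computation whenever its upper bound falls below the threshold or its lower bound reaches it; only the undecided objects pay for exact distances.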
Pages: 951-965
Number of pages: 15