Adaptive spatial partitioning for multidimensional data streams

Cited by: 14
Authors
Hershberger, John
Shrivastava, Nisheeth
Suri, Subhash
Toth, Csaba D.
Affiliations
[1] Mentor Graph Corp, Wilsonville, OR 97070 USA
[2] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
[3] MIT, Dept Math, Cambridge, MA 02139 USA
Keywords
multidimensional data stream; summarization; heavy hitters; range query
DOI
10.1007/s00453-006-0070-3
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We propose a space-efficient scheme for summarizing multidimensional data streams. Our sketch can be used to solve spatial versions of several classical data stream queries efficiently. For instance, we can track epsilon-hot spots, which are congruent boxes containing at least an epsilon fraction of the stream, and maintain hierarchical heavy hitters in d dimensions. Our sketch can also be viewed as a multidimensional generalization of the epsilon-approximate quantile summary. The space complexity of our scheme is O((1/epsilon) log R) if the points lie in the domain [0, R]^d, where d is assumed to be a constant. The scheme extends to the sliding window model with a log(epsilon n) factor increase in space, where n is the size of the sliding window. Our sketch can also be used to answer epsilon-approximate rectangular range queries over a stream of d-dimensional points.
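The idea of adaptive partitioning can be illustrated with a minimal Python sketch for d = 2. This is a simplified illustration under stated assumptions, not the authors' algorithm: the names `Node`, `AdaptivePartition`, and `heavy_boxes` are hypothetical, points are assumed to have integer coordinates in [0, R-1] with R a power of two, and the refinement threshold of epsilon * n is chosen only for simplicity. The sketch never merges boxes back together, so it makes no claim to the O((1/epsilon) log R) space bound stated in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    x: int          # lower-left corner of the box
    y: int
    size: int       # side length (a power of two)
    count: int = 0  # points counted at this box, excluding its children
    children: list = field(default_factory=list)


class AdaptivePartition:
    """Toy quadtree summary: a leaf box is refined once it becomes heavy."""

    def __init__(self, R, epsilon):
        self.root = Node(0, 0, R)
        self.epsilon = epsilon
        self.n = 0  # number of stream points seen so far

    def insert(self, px, py):
        self.n += 1
        node = self.root
        # Descend to the smallest existing box that contains the point.
        while node.children:
            half = node.size // 2
            dx = 1 if px >= node.x + half else 0
            dy = 1 if py >= node.y + half else 0
            node = node.children[dx + 2 * dy]
        node.count += 1
        # Assumed refinement rule: split a leaf holding more than epsilon * n points.
        if node.size > 1 and node.count > self.epsilon * self.n:
            half = node.size // 2
            node.children = [
                Node(node.x + dx * half, node.y + dy * half, half)
                for dy in (0, 1)
                for dx in (0, 1)
            ]

    def subtree_count(self, node):
        # Lower bound on the true number of points in the box: points that
        # arrived before a split stay counted at the ancestor.
        return node.count + sum(self.subtree_count(c) for c in node.children)

    def heavy_boxes(self):
        """Return (x, y, size, count) for boxes whose count is at least epsilon * n."""
        result = []
        stack = [self.root]
        while stack:
            node = stack.pop()
            c = self.subtree_count(node)
            if c >= self.epsilon * self.n:
                result.append((node.x, node.y, node.size, c))
                stack.extend(node.children)
        return result


sketch = AdaptivePartition(R=16, epsilon=0.1)
for _ in range(100):
    sketch.insert(3, 3)  # a single dense location in the stream
hot = sketch.heavy_boxes()
```

In this usage example all points fall on one location, so the partition refines repeatedly around (3, 3) and the unit box there is reported among the heavy boxes, while the rest of the domain stays coarse.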
Pages: 97-117
Number of pages: 21