Efficient approximation of correlated sums on data streams

Cited by: 14
Authors
Ananthakrishna, R
Das, A
Gehrke, J
Korn, F
Muthukrishnan, S
Srivastava, D
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
[2] AT&T Labs Res, Florham Pk, NJ 07932 USA
Keywords
correlated aggregates; data streams; approximation; summary structures; a priori error bounds; IP network management;
DOI
10.1109/TKDE.2003.1198391
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many applications, such as IP network management, data arrives in streams, and queries over those streams must be processed online using limited storage. Correlated-sum (CS) aggregates are a natural class of queries formed by composing basic aggregates on (x, y) pairs; they have the form SUM{g(y) : x ≤ f(AGG(x))}, where AGG(x) can be any basic aggregate and f(), g() are user-specified functions. CS-aggregates cannot be computed exactly in one pass through a data stream using limited storage; hence, we study the problem of computing approximate CS-aggregates. We guarantee a priori error bounds when AGG(x) can be computed in limited space (e.g., MIN, MAX, AVG), using two variants of Greenwald and Khanna's summary structure for the approximate computation of quantiles. Using real data sets, we experimentally demonstrate that an adaptation of the quantile summary structure uses much less space, and is significantly faster, than a more direct use of the quantile summary structure, for the same a posteriori error bounds. Finally, we prove that, when AGG(x) is a quantile (which cannot be computed over a data stream in limited space), the error of a CS-aggregate can be arbitrarily large.
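To make the query form SUM{g(y) : x ≤ f(AGG(x))} concrete, the following is a minimal sketch of the exact, two-pass baseline, not the paper's streaming algorithm. It assumes the whole input is buffered in memory, which is precisely what a limited-storage stream setting rules out. The names `correlated_sum` and `average` and the sample data are hypothetical, chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

Pair = Tuple[float, float]  # one (x, y) stream element


def average(xs: Sequence[float]) -> float:
    """AVG as a basic aggregate over the x values."""
    return sum(xs) / len(xs)


def correlated_sum(
    pairs: Sequence[Pair],
    agg: Callable[[Sequence[float]], float],
    f: Callable[[float], float] = lambda a: a,
    g: Callable[[float], float] = lambda y: y,
) -> float:
    """Exact SUM{g(y) : x <= f(AGG(x))}, computed in two passes.

    Illustrative baseline only: it needs all pairs in memory, so it
    does not satisfy the one-pass, limited-storage constraint.
    """
    # Pass 1: evaluate the basic aggregate over every x, then apply f.
    threshold = f(agg([x for x, _ in pairs]))
    # Pass 2: sum g(y) over the pairs whose x is at or below the threshold.
    return sum(g(y) for x, y in pairs if x <= threshold)


# Hypothetical sample data.
pairs = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (6.0, 40.0)]

print(correlated_sum(pairs, average))  # AGG = AVG
print(correlated_sum(pairs, min))      # AGG = MIN
print(correlated_sum(pairs, max))      # AGG = MAX
```

The second pass depends on a threshold that is only known once the first pass has finished, which is why a single pass with bounded memory has to settle for an approximation.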
Pages: 569-572
Page count: 4