Answering ad hoc aggregate queries from data streams using prefix aggregate trees

Authors
Moonjung Cho
Jian Pei
Ke Wang
Affiliations
[1] State University of New York at Buffalo, Department of Computer Science and Engineering
[2] Simon Fraser University, School of Computing Science
[3] 8888 University Drive
Keywords
Data warehousing; Data cube; Data stream; Online analytic processing (OLAP); Aggregate query
Abstract
Some business applications, such as trading management in financial institutions, require accurate answers to ad hoc aggregate queries over data streams. Materializing and incrementally maintaining a full data cube over a data stream, or even a compressed or approximate version of it, is often computationally prohibitive. Previous studies proposed approximate methods for continuous aggregate queries, but these cannot provide accurate answers. In this paper, we develop a novel prefix aggregate tree (PAT) structure for online warehousing of data streams and for answering ad hoc aggregate queries. Often, a data stream can be partitioned into a historical segment, which is stored in a traditional data warehouse, and a transient segment, which can be stored in a PAT to answer ad hoc aggregate queries. The size of a PAT is linear in the size of the transient segment, and only one scan of the data stream is needed to create and incrementally maintain a PAT. Although answering a query with a PAT costs more than with a fully materialized data cube, the query answering time remains linear in the size of the transient segment. Our extensive experimental results on both synthetic and real data sets illustrate the efficiency and scalability of our design.
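The abstract does not spell out the internal layout of a PAT, so the following is only a minimal illustrative sketch of the general idea, not the authors' structure. It assumes a plain trie over a fixed order of dimensions in which every node keeps the count and sum of the measure for all tuples sharing that prefix. The class name `PrefixAggregateSketch`, the `insert` and `query` methods, and the use of `None` as a wildcard are all hypothetical choices made for this example.

```python
class _Node:
    """One trie node: aggregates for all tuples that share this prefix."""
    __slots__ = ("children", "count", "total")

    def __init__(self):
        self.children = {}   # dimension value -> child node
        self.count = 0       # number of tuples under this prefix
        self.total = 0.0     # sum of the measure under this prefix


class PrefixAggregateSketch:
    """Hypothetical prefix tree with per-node aggregates (assumption, not the paper's PAT)."""

    def __init__(self, num_dims):
        self.num_dims = num_dims
        self.root = _Node()

    def insert(self, dims, measure):
        """Add one stream tuple; touches num_dims + 1 nodes, so one scan suffices."""
        if len(dims) != self.num_dims:
            raise ValueError("wrong number of dimension values")
        node = self.root
        node.count += 1
        node.total += measure
        for value in dims:
            node = node.children.setdefault(value, _Node())
            node.count += 1
            node.total += measure

    def query(self, pattern):
        """Return (count, sum) for tuples matching pattern; None is a wildcard."""
        if len(pattern) != self.num_dims:
            raise ValueError("wrong number of dimension values")
        # Index of the last constrained dimension; below it, stored aggregates suffice.
        last = max((i for i, v in enumerate(pattern) if v is not None), default=-1)

        def walk(node, depth):
            if depth > last:
                return node.count, node.total
            value = pattern[depth]
            if value is not None:
                child = node.children.get(value)
                return walk(child, depth + 1) if child is not None else (0, 0.0)
            count, total = 0, 0.0
            for child in node.children.values():
                c, t = walk(child, depth + 1)
                count += c
                total += t
            return count, total

        return walk(self.root, 0)


# Usage with made-up trade tuples (exchange, symbol, side) and a volume measure.
pat = PrefixAggregateSketch(3)
pat.insert(("NYSE", "IBM", "buy"), 100)
pat.insert(("NYSE", "IBM", "sell"), 50)
pat.insert(("NASDAQ", "MSFT", "buy"), 30)
pat.insert(("NYSE", "GE", "buy"), 20)
by_exchange = pat.query(("NYSE", None, None))   # answered from one stored prefix
by_side = pat.query((None, None, "buy"))        # needs a walk over the tree
```

In this sketch the tree has at most one node per tuple per dimension, so for a fixed number of dimensions its size grows linearly with the number of tuples, and a query visits each node at most once. That mirrors the linear bounds claimed in the abstract only under these simplifying assumptions.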
Pages: 301 - 329
Page count: 28
Related papers
50 in total
  • [21] Aggregate computation over data streams
    Lin, Xuemin
    Zhang, Ying
    PROGRESS IN WWW RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2008, 4976 : 10 - 25
  • [22] Efficient Aggregate Queries on Location Data with Confidentiality
    Feng, Da
    Zhou, Fucai
    Wang, Qiang
    Wu, Qiyu
    Li, Bao
    SENSORS, 2022, 22 (13)
  • [23] The Semantics of Aggregate Queries in Data Exchange Revisited
    Kolaitis, Phokion G.
    Spezzano, Francesca
    SCALABLE UNCERTAINTY MANAGEMENT, SUM 2013, 2013, 8078 : 233 - 246
  • [24] Approximation algorithms for aggregate queries on uncertain data
    Chen D.
    Chen L.
    Wang J.
    Wu Y.
    Wang J.
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2018, 58 (03) : 231 - 236
  • [25] Interval Estimation for Aggregate Queries on Incomplete Data
    Zhang, An-Zhen
    Li, Jian-Zhong
    Gao, Hong
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2019, 34 (06) : 1203 - 1216
  • [27] Processing aggregate queries on spatial OLAP data
    Choi, Kenneth
    Luk, Wo-Shun
    DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2008, 5182 : 125 - 134
  • [28] Evaluation of top-k OLAP queries using aggregate R-trees
    Mamoulis, N
    Bakiras, S
    Kalnis, P
    ADVANCES IN SPATIAL AND TEMPORAL DATABASES, PROCEEDINGS, 2005, 3633 : 236 - 253
  • [29] Selecting and using views to compute aggregate queries
    Afrati, Foto
    Chirkova, Rada
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2011, 77 (06) : 1079 - 1107
  • [30] CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees
    Joachim Hammer
    Lixin Fu
    Distributed and Parallel Databases, 2003, 14 : 221 - 254