Parallel Processing of Dynamic Continuous Queries over Streaming Data Flows

Cited by: 29
Authors
Deng, Ze [1]
Wu, Xiaoming [1]
Wang, Lizhen [1, 2]
Chen, Xiaodao [1]
Ranjan, Rajiv [3]
Zomaya, Albert [4]
Chen, Dan [1, 5]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[2] Chinese Acad Sci, Inst Remote Sensing & Digital Earth, Beijing, Peoples R China
[3] CSIRO, ICT Ctr, Computat Informat Div, Sydney, NSW, Australia
[4] Univ Sydney, Sch Informat Technol, Sydney, NSW 2006, Australia
[5] Collaborat & Innovat Ctr Educ Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Streaming data; cell-tree query indexing structure; KDB-Tree; big data computing; data-intensive computing; GPGPU; RANGE QUERIES; POSITIONS; INDEX;
DOI
10.1109/TPDS.2014.2311811
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
More and more real-time applications need to handle dynamic continuous queries over high-density streaming data. Conventional data and query indexing approaches are generally unsuitable because of their excessive maintenance or space costs. To address these problems, this study first proposes a new indexing structure, the CKDB-tree, which fuses an adaptive cell and a KDB-tree. A cell-tree indexing approach that supports dynamic continuous queries has been developed on the basis of the CKDB-tree. The approach significantly reduces space costs and scales well with increasing data size. To provide a scalable solution for filtering massive streaming data, this study also explores the feasibility of using general-purpose computing on graphics processing units (GPGPU). The CKDB-tree-based approach has been extended to operate on both the CPU (host) and the GPU (device). The GPGPU-aided approach performs query indexing on the host while performing streaming data filtering on the device in a massively parallel manner. The two heterogeneous tasks execute in parallel, and the latency of streaming data transfer between the host and the device is hidden. The experimental results indicate that (1) the CKDB-tree reduces the space cost by 60 percent on average compared with the cell-based indexing structure, (2) the CKDB-tree-based approach outperforms its traditional KDB-tree-based counterparts in update cost by 66, 75, and 79 percent on average for uniform, skewed, and hyper-skewed data, respectively, and (3) the GPGPU-aided approach greatly improves on the CKDB-tree-based approach with the support of only a single Kepler GPU, providing real-time filtering of streaming data at 2.5M data tuples per second. Massively parallel computing technology shows great potential for streaming data monitoring.
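To make the idea of query indexing concrete, the following is a minimal sketch of a plain uniform-grid index over continuous range queries. It is not the CKDB-tree described in the abstract: the record gives no detail on the adaptive cells or the KDB-tree fusion, so the class name `GridQueryIndex`, the fixed `cell_size`, and the rectangle representation are all assumptions made for illustration. The sketch only shows the general pattern the abstract relies on: queries are indexed by the cells they overlap, a moving query is re-indexed on update, and each incoming data tuple is filtered against the queries registered in its cell.

```python
from collections import defaultdict


class GridQueryIndex:
    """Uniform-grid index over range queries (illustrative; not the CKDB-tree)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size          # assumed fixed cell width and height
        self.cells = defaultdict(set)       # (cx, cy) -> ids of queries overlapping the cell
        self.queries = {}                   # query id -> (xmin, ymin, xmax, ymax)

    def _cell(self, x, y):
        # Map a point to the integer coordinates of its grid cell.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def _covered_cells(self, rect):
        # Enumerate every grid cell that the query rectangle overlaps.
        xmin, ymin, xmax, ymax = rect
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                yield (cx, cy)

    def insert(self, qid, rect):
        # A dynamic query that moves is handled as remove-then-insert.
        self.remove(qid)
        self.queries[qid] = rect
        for cell in self._covered_cells(rect):
            self.cells[cell].add(qid)

    def remove(self, qid):
        rect = self.queries.pop(qid, None)
        if rect is None:
            return
        for cell in self._covered_cells(rect):
            self.cells[cell].discard(qid)
            if not self.cells[cell]:
                del self.cells[cell]        # keep the grid sparse

    def match(self, x, y):
        # Filter one data tuple: check only the queries registered in its cell.
        hits = []
        for qid in self.cells.get(self._cell(x, y), ()):
            xmin, ymin, xmax, ymax = self.queries[qid]
            if xmin <= x <= xmax and ymin <= y <= ymax:
                hits.append(qid)
        return sorted(hits)


index = GridQueryIndex(cell_size=10.0)
index.insert("q1", (0, 0, 15, 15))
index.insert("q2", (12, 12, 30, 30))
matched = index.match(13, 13)               # point lies inside both rectangles
index.insert("q1", (40, 40, 45, 45))        # q1 moves; its old cells are cleared
matched_after_move = index.match(13, 13)
```

In this sketch, every query is stored once per overlapped cell, which is the kind of space cost that grows quickly for large queries on a fine grid. The `match` loop is independent per tuple, which is why filtering of this kind is a natural candidate for data-parallel execution.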
Pages: 834-846
Number of pages: 13
Related papers
50 in total
  • [1] Parallel processing of continuous queries over data streams
    Safaei, Ali A.
    Haghjoo, Mostafa S.
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2010, 28 (2-3) : 93 - 118
  • [2] Parallel processing of continuous queries over data streams
    Ali A. Safaei
    Mostafa S. Haghjoo
    [J]. Distributed and Parallel Databases, 2010, 28 : 93 - 118
  • [3] DYNAMIC CONTINUOUS QUERY PROCESSING OVER STREAMING DATA
    Ananthi, M.
    Sreedhevi, D. K.
    Sumalatha, M. R.
    [J]. 2016 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY INFORMATION AND COMMUNICATION (ICCPEIC), 2016, : 183 - 187
  • [4] Database-support for Continuous Prediction Queries over Streaming Data
    Akdere, Mert
    Cetintemel, Ugur
    Upfal, Eli
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2010, 3 (01): : 1291 - 1301
  • [5] PSoup: a system for streaming queries over streaming data
    Chandrasekaran, S
    Franklin, MJ
    [J]. VLDB JOURNAL, 2003, 12 (02): : 140 - 156
  • [6] PSoup: a system for streaming queries over streaming data
    Sirish Chandrasekaran
    Michael J. Franklin
    [J]. The VLDB Journal, 2003, 12 : 140 - 156
  • [7] A system for processing continuous queries over infinite data systems
    Vossough, E
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2004, 3180 : 720 - 729
  • [8] SAP: Improving Continuous Top-K Queries Over Streaming Data
    Zhu, Rui
    Wang, Bin
    Yang, Xiaochun
    Zheng, Baihua
    Wang, Guoren
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2017, 29 (06) : 1310 - 1328
  • [9] On processing continuous frequent K-N-match queries for dynamic data over networked data sources
    Shih-Chuan Chiu
    Jiun-Long Huang
    Jen-He Huang
    [J]. Knowledge and Information Systems, 2012, 31 : 547 - 579
  • [10] On processing continuous frequent K-N-match queries for dynamic data over networked data sources
    Chiu, Shih-Chuan
    Huang, Jiun-Long
    Huang, Jen-He
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 31 (03) : 547 - 579