A distributed B+Tree indexing method for processing range queries over streaming data

Cited by: 0
Authors
Shahab Safaee
Meghdad Mirabi
Amir Masoud Rahmani
Ali Asghar Safaei
Affiliations
[1] Islamic Azad University,Department of Computer Engineering, Faculty of Engineering, South Tehran Branch
[2] National Yunlin University of Science and Technology,Future Technology Research Center
[3] Tarbiat Modares University,Department of Medical Informatics, Faculty of Medical Sciences
Source
Cluster Computing | 2024 / Vol. 27
Keywords
B+Tree index; Distributed query processing; Map-Reduce model; Range query; Streaming data;
DOI
Not available
Abstract
A data stream is a massive, unbounded sequence of data elements generated continuously at a high rate. Stream databases pose new challenges for query processing, both because streaming data changes constantly over time and because users submit a wider range of queries than in traditional databases. In this paper, we propose a system architecture with components for distributed indexing of streaming data and for distributed processing of range queries on streaming data. Instead of building one large, centralized B+Tree index, we build a set of small B+Tree indexes, one for every partition of the streaming data. We also design a distributed range search algorithm that each machine in a Spark cluster can use to process range queries independently on each partition of the streaming data. With this architecture, both indexing and querying of streaming data can be performed in a distributed and parallel manner. Our experiments show that the proposed indexing method is more scalable and efficient for processing range queries on streaming data than existing centralized B+Tree indexing methods. It is therefore suited to applications involving data streams with a large volume of data elements and a large number of range queries.
Pages: 1251 - 1274
Page count: 23
Related papers
50 in total
  • [31] Maximizing throughput for queries over streaming sensor data
    Gomes, Joseph
    Choi, Hyeong-Ah
    2006 IEEE INTERNATIONAL CONFERENCE ON MOBILE ADHOC AND SENSOR SYSTEMS, VOLS 1 AND 2, 2006, : 552 - +
  • [32] A framework for multidimensional skyline queries over streaming data
    Alami, Karim
    Maabout, Sofian
    DATA & KNOWLEDGE ENGINEERING, 2020, 127 (127)
  • [33] CRB-tree: An efficient indexing scheme for range-aggregate queries
    Govindarajan, S
    Agarwal, PK
    Arge, L
    DATABASE THEORY ICDT 2003, PROCEEDINGS, 2003, 2572 : 143 - 157
  • [34] Quadrant-Based Minimum Bounding Rectangle-Tree Indexing Method for Similarity Queries over Big Spatial Data in HBase
    Jo, Bumjoon
    Jung, Sungwon
    SENSORS, 2018, 18 (09)
  • [35] An Effective Clustering Method over CF+ Tree Using Multiple Range Queries
    Ryu, Hyeong-Cheol
    Jung, Sungwon
    Pramanik, Sakti
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (09) : 1694 - 1706
  • [36] An Indexing Method of Continuous Spatiotemporal Queries for Stream Data Processing Rules of Detected Target Objects
    Rahman, Muhammad Habibur
    Hong, Bonghee
    Setiawan, Hari
    Lee, Sanghyun
    Lim, Dongjun
    Kim, Woochan
    SENSORS, 2021, 21 (23)
  • [37] Adaptive Processing of Spatial-Keyword Data Over a Distributed Streaming Cluster
    Mahmood, Ahmed R.
    Daghistani, Anas
    Aly, Ahmed M.
    Tang, Mingjie
    Basalamah, Saleh
    Prabhakar, Sunil
    Aref, Walid G.
    26TH ACM SIGSPATIAL INTERNATIONAL CONFERENCE ON ADVANCES IN GEOGRAPHIC INFORMATION SYSTEMS (ACM SIGSPATIAL GIS 2018), 2018, : 219 - 228
  • [38] Optimizing monitoring queries over distributed data
    Neven, Frank
    Van de Craen, Dieter
    ADVANCES IN DATABASE TECHNOLOGY - EDBT 2006, 2006, 3896 : 829 - +
  • [39] Processing SPARQL queries over distributed RDF graphs
    Peng Peng
    Lei Zou
    M. Tamer Özsu
    Lei Chen
    Dongyan Zhao
    The VLDB Journal, 2016, 25 : 243 - 268
  • [40] Processing SPARQL queries over distributed RDF graphs
    Peng, Peng
    Zou, Lei
    Ozsu, M. Tamer
    Chen, Lei
    Zhao, Dongyan
    VLDB JOURNAL, 2016, 25 (02) : 243 - 268