FAST: Frequency-Aware Indexing for Spatio-Textual Data Streams

Cited by: 17
Authors
Mahmood, Ahmed R. [1]
Aly, Ahmed M. [2]
Aref, Walid G. [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
[2] Google Inc, Mountain View, CA USA
DOI
10.1109/ICDE.2018.00036
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Many applications need to process massive streams of spatio-textual data in real time against continuous spatio-textual queries. For example, location-aware ad-targeting publish/subscribe systems must disseminate millions of ads and promotions to millions of users based on the users' locations and textual profiles. In this paper, we study the indexing of continuous spatio-textual queries. Several related spatio-textual indexes exist, and they typically integrate a spatial index with a textual index. However, these indexes usually have a high demand for main memory and assume that the entire vocabulary of keywords is known in advance. They also fail to capture the variations in keyword frequencies across different spatial regions, and they treat frequent and infrequent keywords in the same way. Moreover, existing indexes do not adapt to changes in workload over space and time. For example, some keywords may be trending at certain times in certain locations, and this may change as time passes. These limitations significantly affect the indexing and search performance of existing indexes. We introduce FAST, a Frequency-Aware Spatio-Textual index for continuous spatio-textual queries. FAST is a main-memory index that requires up to one third of the memory needed by the state-of-the-art index. FAST does not assume prior knowledge of the entire vocabulary of indexed objects. FAST adaptively accounts for the differences in keyword frequencies within their corresponding spatial regions to automatically choose the indexing approach that optimizes insertion and search times. Extensive experimental evaluation using real and synthetic datasets demonstrates that FAST is up to 3x faster in search time and 5x faster in insertion time than the state-of-the-art indexes.
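The general idea of treating frequent and infrequent keywords differently can be illustrated with a small Python sketch. This is not the FAST data structure from the paper; it is a toy illustration under stated assumptions. The class name `FrequencyAwareIndex`, the `threshold` and `cell` parameters, the uniform grid, and the conjunctive (all keywords must match) query semantics are all hypothetical choices made for this example. Each continuous query is attached to its currently least-used keyword; a keyword's queries stay in a flat list until the list grows past `threshold`, at which point they are redistributed into a spatial grid.

```python
from collections import defaultdict
from typing import NamedTuple


class Query(NamedTuple):
    qid: int
    rect: tuple          # (x1, y1, x2, y2), hypothetical rectangular range
    keywords: frozenset  # all must appear in an object (assumed AND semantics)


class FrequencyAwareIndex:
    """Toy index: rare keywords use a flat list, frequent ones a grid."""

    def __init__(self, threshold=2, cell=10.0):
        self.threshold = threshold     # hypothetical promotion threshold
        self.cell = cell               # hypothetical grid cell width
        self.count = defaultdict(int)  # keyword -> number of attached queries
        self.flat = defaultdict(list)  # keyword -> [Query, ...]
        self.grid = {}                 # keyword -> {(cx, cy): [Query, ...]}

    def _cells(self, rect):
        # Enumerate every grid cell overlapped by the rectangle.
        x1, y1, x2, y2 = rect
        for cx in range(int(x1 // self.cell), int(x2 // self.cell) + 1):
            for cy in range(int(y1 // self.cell), int(y2 // self.cell) + 1):
                yield (cx, cy)

    def _add_to_grid(self, kw, query):
        for c in self._cells(query.rect):
            self.grid[kw][c].append(query)

    def insert(self, query):
        # Attach the query to its least-used keyword (ties broken by name).
        kw = min(query.keywords, key=lambda k: (self.count[k], k))
        self.count[kw] += 1
        if kw in self.grid:
            self._add_to_grid(kw, query)
            return
        self.flat[kw].append(query)
        if len(self.flat[kw]) > self.threshold:
            # The keyword became frequent: move its queries into a grid.
            self.grid[kw] = defaultdict(list)
            for q in self.flat.pop(kw):
                self._add_to_grid(kw, q)

    def search(self, x, y, keywords):
        kws = set(keywords)
        here = (int(x // self.cell), int(y // self.cell))
        hits = set()
        for kw in kws:
            if kw in self.grid:
                candidates = self.grid[kw].get(here, [])
            else:
                candidates = self.flat.get(kw, [])
            for q in candidates:
                x1, y1, x2, y2 = q.rect
                if x1 <= x <= x2 and y1 <= y <= y2 and q.keywords <= kws:
                    hits.add(q.qid)
        return sorted(hits)


idx = FrequencyAwareIndex(threshold=2, cell=10.0)
idx.insert(Query(1, (0, 0, 5, 5), frozenset({"pizza"})))
idx.insert(Query(2, (0, 0, 50, 50), frozenset({"pizza", "deal"})))
idx.insert(Query(3, (20, 20, 30, 30), frozenset({"pizza"})))
idx.insert(Query(4, (0, 0, 5, 5), frozenset({"sushi"})))
idx.insert(Query(5, (0, 0, 100, 100), frozenset({"pizza"})))

print(idx.search(2, 2, {"pizza", "deal"}))  # [1, 2, 5]
print(idx.search(25, 25, {"pizza"}))        # [3, 5]
```

In this sketch, "pizza" ends up in the grid after its third attached query, while "deal" and "sushi" remain in flat lists. No vocabulary is declared up front; keywords are registered as queries arrive.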
Pages: 305-316
Page count: 12