Sampling Big Trajectory Data for Traversal Trajectory Aggregate Query

Cited by: 6
Authors
Ding, Yichen [1]
Li, Yanhua [2]
Zhou, Xun [1]
Huang, Zhuojie [3]
You, Simin [3]
Luo, Jun [4]
Affiliations
[1] Univ Iowa, Dept Management Sci, Iowa City, IA 52242 USA
[2] Worcester Polytech Inst, Dept Comp Sci, Worcester, MA 01609 USA
[3] Pitney Bowes Inc, Stamford, CT 06926 USA
[4] Lenovo Grp Ltd, Machine Intelligence Lab, Hong Kong, Peoples R China
Funding
National Science Foundation (USA)
Keywords
Traversal trajectory; aggregate query; importance sampling; INTERESTS; POINTS; SEARCH
DOI
10.1109/TBDATA.2018.2830780
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper defines and investigates a novel trajectory query, the Traversal Trajectory Aggregate (TTA) query: given a trajectory database and a pair of upstream and downstream spatio-temporal (ST) regions (i.e., a spatial area coupled with a time interval), a TTA query retrieves the total number of unique trajectories that traverse both ST regions. Such TTA queries play an important role in various urban applications, such as route planning, taxi dispatching, and location-based advertising. Two baselines can answer TTA queries: (a) exact search over the entire ST query regions obtains the exact answer, but its running time becomes extremely long when the ST query regions are large; (b) uniform-sampling-based approaches estimate the query answer from sampled trajectories. However, a uniform sampling distribution may lead to significant estimation variance for TTA queries, because traversal trajectories are relatively few and unevenly distributed in the query regions. To tackle these challenges, this paper proposes a novel Targeted Index Sampling (TIS) framework that answers TTA queries with high estimation accuracy. TIS employs a two-stage framework: a Pilot Sampling Estimation (PSE) stage estimates the distribution of trajectories in the ST query regions, and an Integrated Importance Sampling (IIS) stage collects trajectories according to the importance sampling distribution obtained in PSE and estimates the query result with an asymptotically unbiased estimator. Extensive experiments and case studies using a large-scale real taxi trajectory dataset from Shenzhen, China demonstrate that the TIS framework achieves ≤ 10 percent estimation error with ≥ 90 percent computational time reduction over exact search, and a 50 percent reduction in estimation error (with similar running time) over uniform-distribution-based sampling approaches.
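The two-stage idea in the abstract (a pilot sample that shapes a sampling distribution, followed by importance sampling with inverse-probability reweighting) can be illustrated with a minimal, generic sketch. This is not the paper's TIS implementation. It assumes, hypothetically, that the query region is split into index cells, that each traversal trajectory is attributed to exactly one cell so per-cell counts add up without double counting, and that cells are grouped into fixed-size blocks. The names `pilot_proposal`, `importance_estimate`, `block_size`, `pilot_per_block`, and `floor` are illustrative only, and the estimator shown is the standard with-replacement (Hansen-Hurwitz) form rather than the estimator defined in the paper.

```python
import random


def pilot_proposal(counts, block_size, pilot_per_block, rng, floor=0.1):
    """Stage 1 (pilot): probe a few cells per block uniformly and turn the
    observed block means into a sampling distribution over all cells."""
    weights = []
    for start in range(0, len(counts), block_size):
        block = range(start, min(start + block_size, len(counts)))
        probe = rng.sample(block, min(pilot_per_block, len(block)))
        block_mean = sum(counts[i] for i in probe) / len(probe)
        # The floor keeps every cell reachable, so q_i > 0 everywhere.
        weights.extend([block_mean + floor] * len(block))
    total = sum(weights)
    return [w / total for w in weights]


def importance_estimate(counts, proposal, sample_size, rng):
    """Stage 2: draw cells from the proposal (with replacement) and
    reweight each observed count by 1 / q_i to estimate the total."""
    drawn = rng.choices(range(len(counts)), weights=proposal, k=sample_size)
    return sum(counts[i] / proposal[i] for i in drawn) / sample_size


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: 1000 cells, traversal trajectories clustered in 20.
    counts = [0] * 1000
    for i in range(400, 420):
        counts[i] = 50
    q = pilot_proposal(counts, block_size=50, pilot_per_block=5, rng=rng)
    estimate = importance_estimate(counts, q, sample_size=200, rng=rng)
    print("estimate:", estimate, "exact:", sum(counts))
```

In this sketch, looking up `counts[i]` stands in for the expensive per-cell scan, so only the probed and drawn cells would be examined. Because every cell keeps a nonzero probability, the reweighted average is unbiased for the total once the proposal is fixed; how much variance it saves depends on how well the pilot captures where traversal trajectories concentrate.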
Pages: 550-563
Page count: 14