Automatic User Identification Method across Heterogeneous Mobility Data Sources

Cited by: 0
Authors
Cao, Wei [1, 2]
Wu, Zhengwei [2]
Wang, Dong [1]
Li, Jian [1]
Wu, Haishan [2]
Affiliations
[1] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
[2] Baidu Inc, Big Data Lab, Beijing, Peoples R China
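Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract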
With the ubiquity of location-based services and applications, large volumes of mobility data are generated routinely, usually from heterogeneous data sources such as different GPS-embedded devices, mobile apps, or location-based service providers. In this paper, we investigate efficient ways of identifying users across such heterogeneous data sources. We present a MapReduce-based framework called Automatic User Identification (AUI), which is easy to deploy and can scale to very large data sets. Our framework is based on a novel similarity measure called the signal-based similarity (SIG), which measures the similarity of users' trajectories gathered from different data sources, typically with very different sampling rates and noise patterns. We conduct extensive experimental evaluations, which show that our framework significantly outperforms the existing methods. On the one hand, our study provides an effective approach to the mobility data integration problem on large-scale data sets, i.e., combining mobility data sets from different sources in order to enhance data quality. On the other hand, it provides an in-depth investigation of the widely studied human mobility uniqueness problem under heterogeneous data sources.
Pages: 978-989
Number of pages: 12