A Data Cleaning Method on Massive Spatio-Temporal Data

Cited by: 5
Authors
Ding, Weilong [1, 2]
Cao, Yaqi [1, 2]
Affiliations
[1] North China Univ Technol, Data Engn Inst, Beijing 100144, Peoples R China
[2] Beijing Key Lab Integrat & Anal Large Scale Strea, Beijing 100144, Peoples R China
Source
Keywords
Data cleaning; Spatio-temporal data; Clustering; Filtering; Hadoop; BIG DATA; VIOLATIONS;
DOI
10.1007/978-3-319-49178-3_13
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In open Internet of Things environments, massive volumes of low-quality data accumulate rapidly from sensors. On raw data of this size, correcting for consistency is time-consuming and hard to do accurately, and validating legality is difficult to guarantee without prior knowledge. This paper proposes a data cleaning method for massive bus IC card data that combines time-based clustering with rule-based filtering to guarantee consistency and legality among spatio-temporal attributes. Implemented on Hadoop MapReduce and evaluated on a real data set, the method shows efficiency and accuracy under a wide range of conditions.
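The abstract does not give the algorithm's details, so the following is only a minimal single-machine sketch of how time-based clustering and rule-based filtering might fit together on card swipe records. Everything in it is an assumption made for illustration and not the authors' method: the `Swipe` record layout, the `max_gap` threshold, the station-range rule in `is_legal`, and the choice to repair a cluster by assigning its majority station. It also does not use Hadoop MapReduce.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Swipe:
    """Hypothetical bus IC card record (field names are assumptions)."""
    card_id: str
    timestamp: int           # seconds since midnight (assumed unit)
    station: Optional[int]   # station index on the route; None if missing


def cluster_by_time(swipes: List[Swipe], max_gap: int) -> List[List[Swipe]]:
    """Group swipes whose consecutive timestamps differ by at most max_gap."""
    clusters: List[List[Swipe]] = []
    for s in sorted(swipes, key=lambda r: r.timestamp):
        if clusters and s.timestamp - clusters[-1][-1].timestamp <= max_gap:
            clusters[-1].append(s)
        else:
            clusters.append([s])
    return clusters


def is_legal(s: Swipe, n_stations: int) -> bool:
    """Assumed rule: the station must be present and within the route."""
    return s.station is not None and 0 <= s.station < n_stations


def clean(swipes: List[Swipe], max_gap: int, n_stations: int) -> List[Swipe]:
    """Filter by rule, then make each time cluster agree on one station."""
    cleaned: List[Swipe] = []
    for cluster in cluster_by_time(swipes, max_gap):
        legal = [s for s in cluster if is_legal(s, n_stations)]
        if not legal:
            continue  # no trustworthy station in this cluster; drop it
        majority = Counter(s.station for s in legal).most_common(1)[0][0]
        cleaned.extend(Swipe(s.card_id, s.timestamp, majority) for s in cluster)
    return cleaned
```

In this sketch, swipes that occur close together in time are assumed to belong to the same stop, which is why a record with a missing or out-of-range station inherits the majority station of its cluster, while a cluster with no legal record is discarded.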
Pages: 173-182
Number of pages: 10