A Data Cleaning Method for Big Trace Data Using Movement Consistency

Cited by: 10
Authors
Yang, Xue [1]
Tang, Luliang [1]
Zhang, Xia [2]
Li, Qingquan [3]
Affiliations
[1] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430079, Hubei, Peoples R China
[2] Wuhan Univ, Sch Urban Design, Wuhan 430070, Hubei, Peoples R China
[3] Shenzhen Univ, Coll Civil Engn, Shenzhen 518060, Peoples R China
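Source
SENSORS | 2018 / Vol. 18 / Issue 03
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords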
data cleaning; big data; vehicle trajectory; movement consistency modeling; GPS;
DOI
10.3390/s18030824
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Subject classification code
070302 ; 081704 ;
Abstract
With the popularization of GPS technologies, the massive volume of spatiotemporal GPS traces collected by vehicles is becoming a new kind of big data source for urban geographic information extraction. The growing volume of these datasets, however, creates processing and management difficulties, while their low quality introduces uncertainty when investigating human activities. Based on the error distribution law and the positional accuracy of GPS data, we propose in this paper a data cleaning method for this kind of spatial big data using movement consistency. First, a trajectory is partitioned into a set of sub-trajectories using movement characteristic points. In this process, GPS points indicating that the motion status of the vehicle has changed from one state to another are regarded as the movement characteristic points. Then, GPS data are cleaned based on the similarities of GPS points and the movement consistency model of each sub-trajectory. The movement consistency model is built using the random sample consensus (RANSAC) algorithm, based on the high spatial consistency of high-quality GPS data. The proposed method is evaluated in extensive experiments using GPS trajectories generated by a sample of vehicles over a 7-day period in Wuhan, China. The results show the effectiveness and efficiency of the proposed method.
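The RANSAC step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a sub-trajectory is roughly straight, that points are already projected to planar coordinates in metres, and that a simple line model stands in for the paper's movement consistency model. The function names, the 5 m threshold, and the iteration count are all hypothetical choices made for illustration.

```python
import math
import random


def ransac_line_inliers(points, threshold=5.0, iterations=200, seed=0):
    """Return a boolean mask marking points within `threshold` metres of the
    best line found by random sample consensus (hypothetical helper)."""
    rng = random.Random(seed)
    best_mask = [False] * len(points)
    best_count = -1
    for _ in range(iterations):
        # Draw two distinct positions and use them to define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        dx, dy = x2 - x1, y2 - y1
        norm = math.hypot(dx, dy)
        if norm == 0:
            continue  # coincident points cannot define a line
        # Perpendicular distance of every point to the candidate line.
        mask = [
            abs(dy * (x - x1) - dx * (y - y1)) / norm <= threshold
            for x, y in points
        ]
        count = sum(mask)
        if count > best_count:
            best_mask, best_count = mask, count
    return best_mask


def clean_sub_trajectory(points, threshold=5.0):
    """Keep only the points that agree with the consensus line."""
    mask = ransac_line_inliers(points, threshold)
    return [p for p, keep in zip(points, mask) if keep]


# Ten fixes spaced 10 m apart along a straight road, plus one drifted fix.
track = [(10.0 * i, 0.0) for i in range(10)] + [(45.0, 40.0)]
cleaned = clean_sub_trajectory(track)
```

In this toy input the drifted fix lies 40 m from the road axis, so it falls outside the 5 m band around the consensus line and is dropped, while the ten collinear fixes are kept. Partitioning at movement characteristic points is assumed to have happened before this step and is not shown.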
Pages: 16
Related papers
50 entries in total
  • [1] Enhancing Recall Using Data Cleaning for Biomedical Big Data
    Deshpande, Priya
    Rasin, Alexander
    Tchoua, Roselyne
    Furst, Jacob
    Raicu, Daniela A.
    Antani, Sameer
    2020 IEEE 33RD INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS(CBMS 2020), 2020, : 265 - 270
  • [2] Big Data Cleaning
    Tang, Nan
    WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 13 - 24
  • [3] A Big Data Cleaning Method for Drinking-Water Streaming Data
    Gai, Rong-Li
    Zhang, Hao
    Thanh, Dang Ngoc Hoang
    BRAZILIAN ARCHIVES OF BIOLOGY AND TECHNOLOGY, 2023, 66
  • [4] Data cleaning and restoring method for vehicle battery big data platform
    Li, Shuangqi
    He, Hongwen
    Zhao, Pengfei
    Cheng, Shuang
    APPLIED ENERGY, 2022, 320
  • [5] An Incorrect Data Detection Method for Big Data Cleaning of Machinery Condition Monitoring
    Xu, Xuefang
    Lei, Yaguo
    Li, Zeda
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2020, 67 (03) : 2326 - 2336
  • [6] Research on the Technology of Data Cleaning in Big Data
    Feng, Fu-jun
    Yao, Jun-ping
    Li, Xiao-jun
    2018 2ND INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS, MODELING AND SIMULATION (AMMS 2018), 2018, 305 : 176 - 181
  • [7] Big RDF Data Cleaning
    Tang, Nan
    2015 13TH IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDEW), 2015, : 77 - 79
  • [8] A Road Map Refinement Method Using Delaunay Triangulation for Big Trace Data
    Tang, Luliang
    Ren, Chang
    Liu, Zhang
    Li, Qingquan
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2017, 6 (02)
  • [9] Data Cleaning Optimization for Grain Big Data Processing using Task Merging
    Ju, Xingang
    Lian, Feiyu
    Zhang, Yuan
    2019 6TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2019), 2019, : 225 - 233
  • [10] Modifying Cleaning Method in Big Data Analytics Process using Random Forest Classifier
    Hossen, J.
    Jesmeen, M. Z. H.
    Sayeed, Shohel
    PROCEEDINGS OF THE 2018 7TH INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION ENGINEERING (ICCCE), 2018, : 208 - 213