Labeling Big Spatial Data: A Case Study of New York Taxi Limousine Dataset

Authors
AlBatati, Fawaz [1]
Alarabi, Louai [1]
Affiliations
[1] Umm Al Qura Univ, Coll Comp & Informat Syst, Dept Comp Sci, Mecca, Saudi Arabia
Keywords
Unsupervised Learning; K-means Clustering Algorithm; Unlabeled data; Spatial-data; Trajectory
DOI
10.22937/IJCSNS.2021.21.6.27
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Clustering unlabeled spatial datasets to convert them into labeled spatial datasets is a challenging task, especially for geographical information systems. In this study, we investigated the NYC Taxi and Limousine Commission dataset and found that all of its spatio-temporal trajectories are unlabeled, which makes the data unsuitable for supervised data mining tasks such as classification and regression. It is therefore necessary to convert the unlabeled spatial datasets into labeled ones, and we use a clustering technique to do this for all of the trajectory datasets. A key difficulty in applying machine learning classification algorithms in many applications is that they require large amounts of labeled data, and labeling big data is in many cases a costly process. In this paper, we show the effectiveness of using a clustering technique to label spatial data, which leads to a high-accuracy classifier.
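The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of labeling spatial points by clustering. It assumes plain Lloyd's k-means over (longitude, latitude) pairs with k = 2. The function name `kmeans`, the choice of k, and the six coordinates (invented points near Midtown Manhattan and JFK Airport) are illustrative assumptions, not taken from the paper or from the actual dataset.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points (an assumed choice).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point takes the index of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


# Hypothetical pickup coordinates (longitude, latitude), invented for illustration.
points = [
    (-73.985, 40.758), (-73.984, 40.757), (-73.986, 40.759),  # near Midtown
    (-73.778, 40.641), (-73.779, 40.642), (-73.777, 40.640),  # near JFK
]

centroids, labels = kmeans(points, k=2)

# Each point paired with its cluster id: a labeled version of the input.
labeled_points = list(zip(points, labels))
```

The resulting cluster ids can then serve as class labels when training a supervised classifier, which is the use the abstract describes. Euclidean distance on raw degrees is a simplification that is only reasonable over a small area such as a single city.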
Pages: 207-212
Number of pages: 6