A Distributed Framework for Online Stream Data Clustering

Authors
Ding, Jiafeng [1]
Fang, Junhua [1]
Chao, Pingfu [2]
Xu, Jiajie [1]
Zhao, PengPeng [1]
Zhao, Lei [1]
Affiliations
[1] Soochow Univ, Dept Comp Sci & Technol, Suzhou, Peoples R China
[2] Univ Queensland, Brisbane, Qld, Australia
Keywords
Real-time cluster analysis; Distributed stream processing; Spatial-temporal data mining; Top-k query; Parallel computing; DBSCAN ALGORITHM; EVOLUTION; SCALE;
DOI
10.1007/978-3-030-60245-1_13
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The recent prevalence of positioning sensors and mobile devices generates a massive amount of spatial-temporal data from moving objects in real time. As one of the fundamental processes in data analysis, clustering of spatial-temporal data enables various applications, such as event detection and travel pattern extraction. However, most existing work focuses only on the offline scenario and is not applicable to online, time-sensitive applications because of its low efficiency and its disregard of temporal features. In this paper, we propose a distributed streaming framework for spatial-temporal data clustering, which accepts various clustering algorithms while ensuring low resource consumption and result correctness. The framework includes a dynamic partitioning strategy for continuous load balancing and a cluster-merging algorithm based on convex hulls [10], which guarantees result correctness. Extensive experiments on a real dataset demonstrate the effectiveness of the proposed framework and its advantage over existing solutions.
Pages: 190 - 204
Page count: 15
Related papers
50 in total
  • [1] DistClusTree: A Framework for Distributed Stream Clustering
    Hesabi, Zhinoos Razavi
    Sellis, Timos
    Liao, Kewen
    [J]. DATABASES THEORY AND APPLICATIONS, ADC 2018, 2018, 10837 : 288 - 299
  • [2] FuzzStream: Fuzzy Data Stream Clustering Based on the Online-Offline Framework
    Lopes, Priscilla de Abreu
    Camargo, Heloisa de Arruda
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2017
  • [3] DistStream: An Order-Aware Distributed Framework for Online-Offline Stream Clustering Algorithms
    Xu, Lijie
    Ye, Xingtong
    Kang, Kai
    Guo, Tian
    Dou, Wensheng
    Wang, Wei
    Wei, Jun
    [J]. 2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS), 2020 : 842 - 852
  • [4] A Tensor Framework for Data Stream Clustering and Compression
    Cyganek, Boguslaw
    Wozniak, Michal
    [J]. IMAGE ANALYSIS AND PROCESSING (ICIAP 2017), PT I, 2017, 10484 : 163 - 173
  • [5] MOA: Massive Online Analysis, a Framework for Stream Classification and Clustering
    Bifet, Albert
    Holmes, Geoff
    Pfahringer, Bernhard
    Kranen, Philipp
    Kremer, Hardy
    Jansen, Timm
    Seidl, Thomas
    [J]. PROCEEDINGS OF THE FIRST WORKSHOP ON APPLICATIONS OF PATTERN ANALYSIS, 2010, 11 : 44 - 50
  • [6] A Prediction Framework for Distributed Data Stream Processing
    He ZhiYong
    Du RongHua
    [J]. PROCEEDINGS OF THE 2009 PACIFIC-ASIA CONFERENCE ON CIRCUITS, COMMUNICATIONS AND SYSTEM, 2009 : 179 - 183
  • [7] DS-Means: Distributed Data Stream Clustering
    Guerrieri, Alessio
    Montresor, Alberto
    [J]. EURO-PAR 2012 PARALLEL PROCESSING, 2012, 7484 : 260 - 271
  • [8] Online Clustering for Trajectory Data Stream of Moving Objects
    Yu, Yanwei
    Wang, Qin
    Wang, Xiaodong
    Wang, Huan
    He, Jie
    [J]. COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2013, 10 (03) : 1293 - 1317
  • [9] Introduction to stream: An Extensible Framework for Data Stream Clustering Research with R
    Hahsler, Michael
    Bolanos, Matthew
    Forrest, John
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2017, 76 (14) : 1 - 50
  • [10] Clustering Data Stream Under a Belief Function Framework
    Bahri, Maroua
    Elouedi, Zied
    [J]. 2016 IEEE/ACS 13TH INTERNATIONAL CONFERENCE OF COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2016