A Distributed Framework for Online Stream Data Clustering

Authors
Ding, Jiafeng [1]
Fang, Junhua [1]
Chao, Pingfu [2]
Xu, Jiajie [1]
Zhao, PengPeng [1]
Zhao, Lei [1]
Affiliations
[1] Soochow Univ, Dept Comp Sci & Technol, Suzhou, Peoples R China
[2] Univ Queensland, Brisbane, Qld, Australia
Keywords
Real-time cluster analysis; Distributed stream processing; Spatial-temporal data mining; Top-k query; Parallel computing; DBSCAN algorithm; Evolution; Scale
DOI
10.1007/978-3-030-60245-1_13
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The recent prevalence of positioning sensors and mobile devices generates a massive amount of spatial-temporal data from moving objects in real time. As one of the fundamental processes in data analysis, clustering of spatial-temporal data enables various applications, such as event detection and travel pattern extraction. However, most existing work focuses only on the offline scenario and is not applicable to online, time-sensitive applications because of its low efficiency and its disregard of temporal features. In this paper, we propose a distributed streaming framework for spatial-temporal data clustering, which accepts various clustering algorithms while ensuring low resource consumption and result correctness. The framework includes a dynamic partitioning strategy for continuous load balancing and a cluster-merging algorithm based on convex hulls [10], which guarantees the correctness of the results. Extensive experiments on a real dataset demonstrate the effectiveness of the proposed framework and its advantage over existing solutions.
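
The abstract gives no details of the cluster-merging step, so the following is only a minimal sketch of one way a convex-hull-based merge could work, not the authors' algorithm. It assumes that each partition reports its local clusters as lists of 2D points, that every cluster has at least three non-collinear points, and that two local clusters belong together whenever their convex hulls overlap. The names `convex_hull`, `hulls_overlap`, and `merge_clusters` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hulls_overlap(h1: List[Point], h2: List[Point]) -> bool:
    """Separating-axis test for two convex polygons (assumed >= 3 vertices each)."""
    for hull in (h1, h2):
        n = len(hull)
        for i in range(n):
            x1, y1 = hull[i]
            x2, y2 = hull[(i + 1) % n]
            ax, ay = y1 - y2, x2 - x1  # normal of the current edge
            p1 = [ax * x + ay * y for x, y in h1]
            p2 = [ax * x + ay * y for x, y in h2]
            if max(p1) < min(p2) or max(p2) < min(p1):
                return False  # found a separating axis
    return True


def merge_clusters(clusters: List[List[Point]]) -> List[List[Point]]:
    """Union local clusters whose hulls overlap (assumed merge criterion)."""
    hulls = [convex_hull(c) for c in clusters]
    parent = list(range(len(clusters)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(hulls)):
        for j in range(i + 1, len(hulls)):
            if hulls_overlap(hulls[i], hulls[j]):
                parent[find(j)] = find(i)

    merged: Dict[int, List[Point]] = {}
    for i, c in enumerate(clusters):
        merged.setdefault(find(i), []).extend(c)
    return list(merged.values())
```

Comparing hulls rather than raw points keeps the amount of data exchanged between partitions small, since only the boundary vertices of each local cluster are needed for the overlap test. The pairwise loop is quadratic in the number of local clusters; a spatial index would be the natural replacement at scale.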
Pages: 190-204
Page count: 15
Related papers
50 in total
  • [31] WSRF-based computing framework of distributed data stream queries
    Liu, JW
    Le, JJ
    [J]. GRID AND COOPERATIVE COMPUTING GCC 2004, PROCEEDINGS, 2004, 3251 : 951 - 954
  • [32] User online behavior based on big data distributed clustering algorithm
    Wang, Yan
    [J]. INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2020, 17 (02):
  • [33] Online monitor framework for network distributed data acquisition systems
    Konno, Tomoyuki
    Cabrera, Anatael
    Ishitsuka, Masaki
    Kuze, Masahiro
    Sakamoto, Yasunobu
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON TECHNOLOGY AND INSTRUMENTATION IN PARTICLE PHYSICS (TIPP 2011), 2012, 37 : 1835 - 1840
  • [34] Data stream clustering: a review
    Zubaroglu, Alaettin
    Atalay, Volkan
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2021, 54 (02) : 1201 - 1236
  • [35] Recent trends in distributed online stream processing platform for big data: Survey
    Ali, Ahmed Hussein
    Abdullah, Mahmood Zaki
    [J]. 2018 1ST ANNUAL INTERNATIONAL CONFERENCE ON INFORMATION AND SCIENCES (AICIS 2018), 2018, : 140 - 145
  • [36] Data Stream Clustering: A Survey
    Silva, Jonathan A.
    Faria, Elaine R.
    Barros, Rodrigo C.
    Hruschka, Eduardo R.
    de Carvalho, Andre C. P. L. F.
    Gama, Joao
    [J]. ACM COMPUTING SURVEYS, 2013, 46 (01)
  • [37] Data stream clustering: a review
    Alaettin Zubaroğlu
    Volkan Atalay
    [J]. Artificial Intelligence Review, 2021, 54 : 1201 - 1236
  • [38] SubtStream: Online subtractive stream clustering algorithm
    Milli, Musa
    Bulut, Hasan
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (15):
  • [39] i-CODAS: An Improved Online Data Stream Clustering in Arbitrary Shaped Clusters
    Islam, Md Kamrul
    Ahmed, Md Manjur
    Zamli, Kamal Zuhairi
    [J]. ENGINEERING LETTERS, 2019, 27 (04) : 752 - 762
  • [40] Online Learning for Foot Contact Detection of Legged Robot Based on Data Stream Clustering
    Liu, Qingyu
    Yuan, Bing
    Wang, Yang
    [J]. FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY, 2022, 9