Cost-effective and adaptive clustering algorithm for stream processing on cloud system

Cited by: 0
Authors
Yue Xia
Junhua Fang
Pingfu Chao
Zhicheng Pan
Jedi S. Shang
Affiliations
[1] Soochow University, School of Computer Science and Technology
[2] The University of Queensland, School of Information Technology and Electrical Engineering
[3] Thinvent Technology Co. LTD.
Source
GeoInformatica | 2023 / Vol. 27
Keywords
Real-time processing; Density-based clustering; Window model; Time interval; Cluster evolution
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Clustering is a fundamental operation that plays an essential role in data management and analysis. Clustering algorithms have been well studied over the past two decades, but real-time clustering has yet to mature in practice. For applications built on clustering, capturing the dynamic changes of clusters and the trends of moving objects in real time can maximize the value of the data. Although a DSPE (Distributed Stream Processing Engine) is capable of such workloads, it still suffers from fixed window sizes and wasted computational resources. In this paper, we introduce a new Cost-effective and Adaptive Clustering method (CeAC), which improves computational efficiency while preserving the accuracy of the clustering result. Specifically, we design a composite window model that contains the latest data records and maintains historical states. To achieve lightweight clustering, we propose a fully online clustering algorithm based on grid density, which can capture clusters of arbitrary shape and effectively handle outliers in parallel. We further introduce an adaptive calculation model that accelerates the clustering operation by shedding workload according to the characteristics of the incoming data. Experimental results show that the proposed method is accurate and efficient in real-time data stream clustering.
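To make the idea of grid-density clustering concrete, the following is a minimal batch sketch in Python. It is not the CeAC algorithm from the paper: the abstract gives no implementation details, so the function name `grid_density_clusters`, the parameters `cell_size` and `min_pts`, and the 8-neighbour adjacency rule are all assumptions chosen for illustration. The sketch also omits the composite window model, the online updates, and the adaptive load shedding described above.

```python
from collections import defaultdict, deque


def grid_density_clusters(points, cell_size=1.0, min_pts=3):
    """Label 2-D points by connected groups of dense grid cells.

    Hypothetical illustration only; parameters are assumptions,
    not values taken from the paper.
    """
    def cell_of(x, y):
        # Map a point to the integer coordinates of its grid cell.
        return (int(x // cell_size), int(y // cell_size))

    # Count how many points fall into each grid cell.
    counts = defaultdict(int)
    for x, y in points:
        counts[cell_of(x, y)] += 1

    # A cell is "dense" when it holds at least min_pts points.
    dense = {cell for cell, n in counts.items() if n >= min_pts}

    # Merge adjacent dense cells (8-neighbourhood) into clusters using BFS,
    # which lets a cluster follow an arbitrary shape across the grid.
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in dense and neighbour not in labels:
                        labels[neighbour] = next_label
                        queue.append(neighbour)
        next_label += 1

    # Points in sparse cells are treated as outliers and labelled -1.
    return [labels.get(cell_of(x, y), -1) for x, y in points]


if __name__ == "__main__":
    sample = [
        (0.1, 0.1), (0.2, 0.3), (0.5, 0.5),   # dense cell (0, 0)
        (1.1, 0.2), (1.2, 0.4), (1.5, 0.9),   # dense cell (1, 0), adjacent
        (5.1, 5.1), (5.2, 5.2), (5.3, 5.3),   # dense cell (5, 5), isolated
        (10.0, 10.0),                         # lone point in a sparse cell
    ]
    print(grid_density_clusters(sample))
```

Because points are only counted per cell and cells are merged by adjacency, the cost depends on the number of occupied cells rather than on pairwise distances between points, which is one reason grid-based methods are often considered for streaming workloads.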
Pages: 1 / 21
Page count: 20