Elastic Scaling of Stateful Operators Over Fluctuating Data Streams

Authors
Wu, Minghui [1]
Sun, Dawei [1]
Gao, Shang [2]
Li, Keqin [3]
Buyya, Rajkumar [4]
Affiliations
[1] China Univ Geosci, Sch Informat Engn, Beijing 100083, Peoples R China
[2] Deakin Univ, Sch Informat Technol, Waurn Ponds, Vic 3216, Australia
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
[4] Univ Melbourne, Sch Comp & Informat Syst, Cloud Comp & Distributed Syst CLOUDS Lab, Melbourne, Vic 3010, Australia
Funding
National Natural Science Foundation of China;
Keywords
Streams; Parallel processing; Topology; Resource management; Data models; Computational modeling; System performance; Distributed stream computing; operator parallelism; resource scaling; state management; stateful operator;
DOI
10.1109/TSC.2024.3436596
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Elastic scaling of parallel operators has emerged as a powerful approach to reducing response time in stream applications with fluctuating inputs. Many state-of-the-art works focus on stateless operators and adjust operator parallelism along a single dimension. They often lack efficient management of operator state and overlook the cost of resource over-provisioning. To overcome these limitations, we introduce Es-Stream for elastic scaling of stateful operators over fluctuating data streams. Our contributions are as follows: 1) We observe that under-provisioning operator parallelism leads to data pile-up and hence longer system latency, while over-provisioning causes idle instances and additional resource consumption. 2) Es-Stream scales in two dimensions: the parallelism of operators and the number of resources. It dynamically adjusts operators to an optimal parallelism while scaling the resources used by the stream application. 3) When the parallelism of stateful operators changes, upstream operators back up downstream operators' state and cache the emitted data tuples at dynamic time intervals, so that operator parallelism is adjusted with low overhead. 4) Experimental results demonstrate that Es-Stream provides promising performance improvements, reducing the maximum system latency by 3x and the maximum state recovery time by 2x compared to existing state-of-the-art works.
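The two scaling dimensions named in the abstract can be illustrated with a minimal sketch. It is not the Es-Stream algorithm, which this record does not describe in detail. It assumes a simple capacity model in which each operator instance sustains a fixed tuple rate and each worker offers a fixed number of slots. The function names and the parameters `per_instance_rate`, `max_parallelism`, and `slots_per_worker` are hypothetical.

```python
import math


def target_parallelism(input_rate, per_instance_rate, max_parallelism):
    """Smallest instance count whose combined capacity covers the input rate.

    Assumed model (not from the paper): each instance handles
    `per_instance_rate` tuples per second, independent of the others.
    """
    if per_instance_rate <= 0:
        raise ValueError("per_instance_rate must be positive")
    needed = math.ceil(input_rate / per_instance_rate)
    # Keep at least one instance alive and respect the configured ceiling.
    return max(1, min(needed, max_parallelism))


def target_workers(parallelism, slots_per_worker):
    """Number of workers needed to host `parallelism` operator instances."""
    if slots_per_worker <= 0:
        raise ValueError("slots_per_worker must be positive")
    return math.ceil(parallelism / slots_per_worker)


# Dimension 1: operator parallelism follows the observed input rate.
p = target_parallelism(input_rate=950, per_instance_rate=200, max_parallelism=16)
# Dimension 2: the resource count follows the chosen parallelism.
w = target_workers(p, slots_per_worker=4)
print(p, w)
```

Under this model, too few instances leave part of the input rate uncovered, so tuples pile up, while too many leave instances idle. These are the two failure modes the abstract contrasts.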
Pages: 3555-3568
Page count: 14