Elastic Scaling of Stateful Operators Over Fluctuating Data Streams

Authors
Wu, Minghui [1]
Sun, Dawei [1]
Gao, Shang [2]
Li, Keqin [3]
Buyya, Rajkumar [4]
Affiliations
[1] China Univ Geosci, Sch Informat Engn, Beijing 100083, Peoples R China
[2] Deakin Univ, Sch Informat Technol, Waurn Ponds, Vic 3216, Australia
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
[4] Univ Melbourne, Sch Comp & Informat Syst, Cloud Comp & Distributed Syst CLOUDS Lab, Melbourne, Vic 3010, Australia
Funding
National Natural Science Foundation of China
Keywords
Streams; Parallel processing; Topology; Resource management; Data models; Computational modeling; System performance; Distributed stream computing; operator parallelism; resource scaling; state management; stateful operator
DOI
10.1109/TSC.2024.3436596
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Elastic scaling of parallel operators has emerged as a powerful approach to reducing response time in stream applications with fluctuating inputs. Many state-of-the-art works focus on stateless operators and adjust operator parallelism along a single dimension. They often lack efficient management of operator state and overlook the cost of resource over-provisioning. To overcome these limitations, we introduce Es-Stream for elastic scaling of stateful operators over fluctuating data streams. Our contributions are as follows: 1) We observe that under-provisioning of operator parallelism leads to data pile-up and therefore longer system latency, while over-provisioning causes idle instances and additional resource consumption. 2) The Es-Stream system scales in two dimensions: the parallelism of operators and the number of resources. It dynamically adjusts operators to an optimal parallelism while scaling the resources used by the stream application. 3) When the parallelism of stateful operators changes, upstream operators back up the state of downstream operators and cache the emitted data tuples at dynamic time intervals, so that operator parallelism is adjusted with low overhead. 4) Experimental results demonstrate that Es-Stream reduces the maximum system latency by 3x and the maximum state recovery time by 2x compared to existing state-of-the-art works.
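The trade-off in observation 1 can be illustrated with a minimal sketch. It is not the Es-Stream algorithm: the function name `target_parallelism`, the rate parameters, the utilization target, and the parallelism cap are all assumptions introduced here for illustration. The idea is simply to pick the smallest instance count that keeps up with the input rate, so that tuples do not pile up and instances do not sit idle.

```python
import math


def target_parallelism(input_rate, per_instance_rate,
                       target_utilization=0.8, max_parallelism=64):
    """Hypothetical helper (not from the paper): smallest number of operator
    instances that keeps each instance at or below the target utilization.

    input_rate         -- tuples per second arriving at the operator
    per_instance_rate  -- tuples per second one instance can process
    target_utilization -- assumed headroom factor in (0, 1]
    max_parallelism    -- assumed upper bound set by available resources
    """
    if per_instance_rate <= 0:
        raise ValueError("per_instance_rate must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if input_rate < 0:
        raise ValueError("input_rate must be non-negative")

    # Too few instances would let tuples queue up (under-provisioning);
    # rounding up to the nearest integer avoids that.
    needed = math.ceil(input_rate / (per_instance_rate * target_utilization))

    # Never drop below one instance, and never exceed the resource cap;
    # anything above `needed` would only add idle instances.
    return max(1, min(needed, max_parallelism))
```

For example, with an assumed input of 1000 tuples/s and instances that each handle 100 tuples/s at 80% utilization, the sketch asks for 13 instances; when the input drops to 150 tuples/s, it scales back down to 2.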
Pages: 3555-3568
Page count: 14