Evaluation of Load Prediction Techniques for Distributed Stream Processing

Cited by: 4
Authors
Gontarska, Kordian [1, 2]
Geldenhuys, Morgan [2]
Scheinert, Dominik [2]
Wiesner, Philipp [2]
Polze, Andreas [1]
Thamsen, Lauritz [2]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Potsdam, Germany
[2] Tech Univ Berlin, Berlin, Germany
Keywords
Distributed Stream Processing; Resource Management and Optimization; Load Prediction; Time Series Forecasting; Machine Learning;
DOI
10.1109/IC2E52221.2021.00023
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Distributed Stream Processing (DSP) systems enable processing large streams of continuous data to produce results in near real time. They are an essential part of many data-intensive applications and analytics platforms. The rate at which events arrive at DSP systems can vary considerably over time, which may be due to trends as well as cyclic and seasonal patterns within the data streams. A priori knowledge of incoming workloads enables proactive approaches to resource management and optimization tasks such as dynamic scaling, live migration of resources, and the tuning of configuration parameters at runtime, thus leading to a potentially better Quality of Service. In this paper, we conduct a comprehensive evaluation of different load prediction techniques for DSP jobs. We identify three use cases and formulate requirements for making load predictions specific to DSP jobs. Automatically optimized classical and Deep Learning methods are evaluated on nine different datasets from typical DSP domains, i.e., the IoT, Web 2.0, and cluster monitoring. We compare model performance with respect to overall accuracy and training duration. Our results show that the Deep Learning methods provide the most accurate load predictions for the majority of the evaluated datasets.
Pages: 91 - 98
Number of pages: 8
Related papers
50 in total
  • [31] Elastic Stream Processing for Distributed Environments
    Hochreiner, Christoph
    Schulte, Stefan
    Dustdar, Schahram
    Lecue, Freddy
    IEEE INTERNET COMPUTING, 2015, 19 (06) : 54 - 59
  • [32] Distributed Data Stream Processing with Onix
    Shtykh, Roman Y.
    Suzuki, Toshihiro
    2014 IEEE FOURTH INTERNATIONAL CONFERENCE ON BIG DATA AND CLOUD COMPUTING (BDCLOUD), 2014, : 267 - 268
  • [33] Network traffic prediction platform based on distributed stream processing and supported by R language
    Wang Yu
    Yuan Yan
    Wu Shui-qing
    Li Yan-chao
    2018 INTERNATIONAL CONFERENCE ON IMAGE AND VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2018, 10836
  • [34] A Low-Load Distributed Stream Processing System for Continuous Conjunctive Normal Form Queries
    Yoshihisa, Tomoki
    Hara, Takahiro
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2022, 10 (04) : 2281 - 2293
  • [35] Signal processing challenges in distributed stream processing systems
    Frossard, Pascal
    Verscheure, Olivier
    Venkatramani, Chitra
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 5903 - 5906
  • [36] Modeling Data Stream Intensity in Distributed Stream Processing System
    Gorawski, Marcin
    Marks, Pawel
    Gorawski, Michal
    COMPUTER NETWORKS, CN 2013, 2013, 370 : 372 - 383
  • [37] Performance Prediction Techniques for Scalable Large Data Processing in Distributed MPI Systems
    Bhimani, Janki
    Mi, Ningfang
    Leeser, Miriam
    2016 IEEE 35TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2016,
  • [38] Processing Partially Ordered Requests in Distributed Stream Processing Systems
    Cai, Rijun
    Wu, Weigang
    Huang, Ning
    Wu, Lihui
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2016, 2016, 10048 : 211 - 219
  • [39] Stateful Load Balancing for Parallel Stream Processing
    Guo, Qingsong
    Zhou, Yongluan
    EURO-PAR 2017: PARALLEL PROCESSING WORKSHOPS, 2018, 10659 : 80 - 93
  • [40] Optimizing distributed data stream processing by tracing
    Zvara, Zoltan
    Szabo, Peter G. N.
    Balazs, Barnabas
    Benczur, Andras
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 90 : 578 - 591