Automatic Performance Tuning for Distributed Data Stream Processing Systems

Cited by: 7
Authors
Herodotou, Herodotos [1 ]
Odysseos, Lambros [1 ]
Chen, Yuxing [2 ]
Lu, Jiaheng [3 ]
Affiliations
[1] Cyprus Univ Technol, Limassol, Cyprus
[2] Tencent Inc, Shenzhen, Peoples R China
[3] Univ Helsinki, Helsinki, Finland
Keywords
Performance tuning; data stream processing; parameter tuning; Storm; Flink; Spark Streaming;
DOI
10.1109/ICDE53745.2022.00296
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distributed data stream processing systems (DSPSs) such as Storm, Flink, and Spark Streaming are now routinely used to process continuous data streams in (near) real-time. However, achieving the low latency and high throughput demanded by today's streaming applications can be a daunting task, especially since the performance of DSPSs highly depends on a large number of system parameters that control load balancing, degree of parallelism, buffer sizes, and various other aspects of system execution. This tutorial offers a comprehensive review of the state-of-the-art automatic performance tuning approaches that have been proposed in recent years. The approaches are organized into five main categories based on their methodologies and features: cost modeling, simulation-based, experiment-driven, machine learning, and adaptive tuning. The categories of approaches will be analyzed in depth and compared to each other, exposing their various strengths and weaknesses. Finally, we will identify several open research problems and challenges related to automatic performance tuning for DSPSs.
Pages: 3194-3197
Number of pages: 4