Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming

Cited by: 187
Authors
Chintapalli, Sanket [1]
Dagit, Derek [1]
Evans, Bobby [1]
Farivar, Reza [1]
Graves, Thomas [1]
Holderbaugh, Mark [1]
Liu, Zhuo [1]
Nusbaum, Kyle [1]
Patil, Kishorkumar [1]
Peng, Boyang Jerry [1]
Poulosky, Paul [1]
Affiliations
[1] Yahoo Inc, Sunnyvale, CA 94089 USA
Keywords
Streaming processing; Benchmark; Storm; Spark; Flink; Low Latency
DOI
10.1109/IPDPSW.2016.138
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Streaming data processing has been gaining attention due to its application in a wide range of scenarios. To serve the booming demand for streaming data processing, many computation engines have been developed. However, there is still a lack of real-world benchmarks that would help in choosing the most appropriate platform for real-time streaming needs. To address this problem, we developed a streaming benchmark for three representative computation engines: Flink, Storm and Spark Streaming. Instead of testing speed-of-light event processing, we construct a full data pipeline using Kafka and Redis to more closely mimic real-world production scenarios. Based on our experiments, we provide a performance comparison of the three data engines in terms of 99th percentile latency and throughput for various configurations.
Pages: 1789 - 1792
Page count: 4
Related papers
50 in total
  • [1] Comparative Analysis of the Flink and Spark Streaming Stream Computing Models
    宋灵城 (Song, Lingcheng)
    通信技术 (Communications Technology), 2020, 53 (01) : 59 - 62
  • [2] Continuous outlier mining of streaming data in flink
    Toliopoulos, Theodoros
    Gounaris, Anastasios
    Tsichlas, Kostas
    Papadopoulos, Apostolos
    Sampaio, Sandra
    INFORMATION SYSTEMS, 2020, 93 (93)
  • [3] MotionInsights: Object Tracking in Streaming Video with Apache Flink
    Banelas, Dimitrios
    Petrakis, Euripides G. M.
    ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOL 1, AINA 2024, 2024, 199 : 402 - 414
  • [4] Streaming in thermoacoustic engines and refrigerators
    Swift, GW
    NONLINEAR ACOUSTICS AT THE TURN OF THE MILLENNIUM, 2000, 524 : 105 - 114
  • [5] Modeling and Simulation of Spark Streaming
    Lin, Jia-Chun
    Lee, Ming-Chang
    Yu, Ingrid Chieh
    Johnsen, Einar Broch
    PROCEEDINGS 2018 IEEE 32ND INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA), 2018 : 407 - 413
  • [6] A Framework for Similarity Search in Streaming Time Series based on Spark Streaming
    Bui Cong Giao
    Phan Cong Vinh
    MOBILE NETWORKS & APPLICATIONS, 2022, 27 (05) : 2084 - 2097
  • [7] A Framework for Similarity Search in Streaming Time Series based on Spark Streaming
    Bui Cong Giao
    Phan Cong Vinh
    Mobile Networks and Applications, 2022, 27 : 2084 - 2097
  • [8] Streaming Massive Electric Power Data Analysis Based on Spark Streaming
    Zhang, Xudong
    Qian, Zhongwen
    Shen, Siqi
    Shi, Jia
    Wang, Shujun
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 200 - 212
  • [9] Benchmarking Modern Distributed Streaming Platforms
    Qian, Shilei
    Wu, Gang
    Huang, Jie
    Das, Tathagata
    PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016 : 592 - 598
  • [10] Streaming computation of combinatorial objects
    Bar-Yossef, Z
    Reingold, O
    Shaltiel, R
    Trevisan, L
    17TH ANNUAL IEEE CONFERENCE ON COMPUTATIONAL COMPLEXITY, PROCEEDINGS, 2002 : 165 - 174