Towards Low-Latency Batched Stream Processing by Pre-Scheduling

Cited by: 10
Authors
Jin, Hai [1 ]
Chen, Fei [1 ]
Wu, Song [1 ]
Yao, Yin [1 ]
Liu, Zhiyi [1 ]
Gu, Lin [1 ]
Zhou, Yongluan [2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab, Sch Comp Sci & Technol, Wuhan 430074, Hubei, Peoples R China
[2] Univ Copenhagen, Dept Comp Sci, DK-1017 Copenhagen, Denmark
Keywords
Stream processing; recurring jobs; straggler; scheduling; data assignment
DOI
10.1109/TPDS.2018.2866581
Chinese Library Classification
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
Many stream processing frameworks have been developed to meet the requirements of real-time processing. Among them, batched stream processing frameworks are widely advocated for their fault tolerance, high throughput, and unified runtime with batch processing. In batched stream processing frameworks, stragglers, which arise from uneven task execution times, have been regarded as a major hurdle for latency-sensitive applications. Existing straggler mitigation techniques, operating in either a reactive or a proactive manner, are all post-scheduling methods and therefore inevitably result in high resource overhead or long job completion times. We observe that batched stream processing jobs are usually recurring and have predictable characteristics. Exploiting this observation, we present a pre-scheduling straggler mitigation framework called Lever. Lever first identifies potential stragglers and evaluates each node's capacity by analyzing the execution information of historical jobs. It then carefully pre-schedules job input data to each node, before task scheduling, so as to mitigate potential stragglers. We implement Lever and contribute it as an extension of Apache Spark Streaming. Our experimental results show that Lever reduces job completion time by 30.72 to 42.19 percent over Spark Streaming, a widely adopted batched stream processing system, and significantly outperforms traditional straggler mitigation techniques.
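The abstract describes capacity-aware pre-assignment of batch input data based on historical execution information. The sketch below is a minimal illustration of that general idea, not Lever's actual code or interfaces: the Worker class, assignRecords, and the inverse-of-average-task-time capacity heuristic are hypothetical assumptions. Input records are split across workers in proportion to their estimated capacity, so slower nodes (likely stragglers) receive less data before task scheduling begins.

    // Hypothetical sketch of capacity-proportional data pre-assignment (plain Scala, no Spark dependency).
    object PreScheduleSketch {
      // A worker's capacity is estimated from historical batch runs:
      // lower average task time implies higher capacity.
      case class Worker(id: String, avgTaskMillis: Double)

      def assignRecords(totalRecords: Long, workers: Seq[Worker]): Map[String, Long] = {
        // Capacity is modeled as the inverse of the observed average task time.
        val capacities = workers.map(w => w.id -> 1.0 / w.avgTaskMillis).toMap
        val capSum = capacities.values.sum
        // Proportional split of the batch's input records across workers.
        val base = workers.map(w => w.id -> (totalRecords * capacities(w.id) / capSum).toLong).toMap
        // Rounding remainder goes to the fastest worker.
        val remainder = totalRecords - base.values.sum
        val fastest = workers.minBy(_.avgTaskMillis).id
        base.updated(fastest, base(fastest) + remainder)
      }

      def main(args: Array[String]): Unit = {
        val workers = Seq(Worker("node-1", 80.0), Worker("node-2", 120.0), Worker("node-3", 200.0))
        // node-1 (fastest) receives the largest share; node-3 (slowest) the smallest.
        println(assignRecords(10000L, workers))
      }
    }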
Pages: 710-722
Number of pages: 13
Related Papers
50 records in total
  • [1] Lever: Towards Low-Latency Batched Stream Processing by Pre-Scheduling
    Chen, Fei
    Wu, Song
    Jin, Hai
    Yao, Yin
    Liu, Zhiyi
    Gu, Lin
    Zhou, Yongluan
    [J]. PROCEEDINGS OF THE 2017 SYMPOSIUM ON CLOUD COMPUTING (SOCC '17), 2017, : 643 - 643
  • [2] TurboStream: Towards Low-Latency Data Stream Processing
    Wu, Song
    Liu, Mi
    Ibrahim, Shadi
    Jin, Hai
    Gu, Lin
    Chen, Fei
    Liu, Zhiyi
    [J]. 2018 IEEE 38TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS), 2018, : 983 - 993
  • [3] Low-Latency Scheduling in MPTCP
    Hurtig, Per
    Grinnemo, Karl-Johan
    Brunstrom, Anna
    Ferlin, Simone
    Alay, Ozgu
    Kuhn, Nicolas
    [J]. IEEE-ACM TRANSACTIONS ON NETWORKING, 2019, 27 (01) : 302 - 315
  • [4] PRE-SCHEDULING ALGORITHM - SCHEDULING A SUITABLE MIX PRIOR TO PROCESSING
    FORBES, K
    GOLDSWORTHY, AW
    [J]. COMPUTER JOURNAL, 1977, 20 (01): : 27 - 29
  • [5] A Distributed and Scalable Framework for Low-Latency Continuous Trajectory Stream Processing
    Shaikh, Salman Ahmed
    Kitagawa, Hiroyuki
    Matono, Akiyoshi
    Kim, Kyoung-Sook
    [J]. IEEE Access, 2024, 12 : 159426 - 159444
  • [6] Hazelcast Jet: Low-latency Stream Processing at the 99.99th Percentile
    Gencer, Can
    Topolnik, Marko
    Durina, Viliam
    Demirci, Emin
    Kahveci, Ensar B.
    Gurbuz, Ali
    Lukas, Ondrej
    Bartok, Jozsef
    Gierlach, Grzegorz
    Hartman, Frantisek
    Yilmaz, Ufuk
    Dogan, Mehmet
    Mandouh, Mohamed
    Fragkoulis, Marios
    Katsifodimos, Asterios
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (12): : 3110 - 3121
  • [7] Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing
    Walulya, Ivan
    Nikolakopoulos, Yiannis
    Gulisano, Vincenzo
    Papatriantafilou, Marina
    Tsigas, Philippas
    [J]. EURO-PAR 2017: PARALLEL PROCESSING WORKSHOPS, 2018, 10659 : 129 - 140
  • [8] Viper: A module for communication-layer determinism and scaling in low-latency stream processing
    Walulya, Ivan
    Palyvos-Giannas, Dimitris
    Nikolakopoulos, Yiannis
    Gulisano, Vincenzo
    Papatriantafilou, Marina
    Tsigas, Philippas
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 88 : 297 - 308
  • [9] Demo Abstract: Towards In-Network Processing for Low-Latency Industrial Control
    Rueth, Jan
    Glebke, Rene
    Ulmen, Tanja
    Wehrle, Klaus
    [J]. IEEE INFOCOM 2018 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (INFOCOM WKSHPS), 2018,
  • [10] Radar: Reducing Tail Latencies for Batched Stream Processing with Blank Scheduling
    Chen, Fei
    Wu, Song
    Jin, Hai
    Lin, Liwei
    Li, Rui
    [J]. IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 797 - 804