Resource aware scheduler for distributed stream processing in cloud native environments

Cited by: 2
Authors
Sarathchandra, Madushi [1]
Karandana, Chulani [1]
Heenatigala, Winma [1]
Dayarathna, Miyuru [2]
Jayasena, Sanath [1]
Affiliations
[1] Univ Moratuwa, Dept Comp Sci & Engn, Moratuwa, Sri Lanka
[2] WSO2 Inc, 787 Castro St, Mountain View, CA 94041 USA
Source
Keywords
autoscaling; cloud computing; event-based systems; IaaS; Knapsack; Kubernetes; machine learning; scalability; software performance engineering; stream processing
DOI
10.1002/cpe.6373
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
Distributed stream processors are increasingly being deployed in cloud computing infrastructures. In this article, we study the performance characteristics of distributed stream processing applications in Google Compute Engine, which is based on Kubernetes. We identify performance gaps in terms of throughput that appear in such environments when a round robin (RR) scheduling algorithm is used. As a solution, we propose a resource aware stream processing scheduler called resource aware scheduler for stream processing applications in cloud native environments (RaspaCN). We implement RaspaCN's job scheduler as a two-step process. First, we use machine learning to identify the optimal number of worker nodes. Second, we use RR and multiple Knapsack algorithms to produce performance-optimal stream processing job schedules. We evaluate the performance benefits of RaspaCN's scheduling algorithm with three application benchmarks representing real-world stream processing scenarios: HTTP Log Processor, Nexmark, and Email Processor. For fixed input data rates, RaspaCN increased average throughput by at least 37%, 38%, and 10% for the HTTP Log Processor, Nexmark, and Email Processor benchmarks, respectively. Furthermore, we conduct experiments with varying input data rates and show a 7% improvement in average throughput for HTTP Log Processor. These experiments show the effectiveness of our proposed stream processor job scheduler in improving performance.
Pages: 16
Related papers
50 results in total
  • [1] gSched: a resource aware Hadoop scheduler for heterogeneous cloud computing environments
    Caruana, Godwin
    Li, Maozhen
    Qi, Man
    Khan, Mukhtaj
    Rana, Omer
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (20):
  • [2] A Hibernation Aware Dynamic Scheduler for Cloud Environments
    Teylo, Luan
    Arantes, Luciana
    Sens, Pierre
    Drummond, Lucia Maria de A.
    [J]. PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPP 2019), 2019,
  • [3] An elastic and traffic-aware scheduler for distributed data stream processing in heterogeneous clusters
    Hadian, Hamid
    Farrokh, Mohammadreza
    Sharifi, Mohsen
    Jafari, Ali
    [J]. JOURNAL OF SUPERCOMPUTING, 2023, 79 (01): : 461 - 498
  • [4] An elastic and traffic-aware scheduler for distributed data stream processing in heterogeneous clusters
    Hamid Hadian
    Mohammadreza Farrokh
    Mohsen Sharifi
    Ali Jafari
    [J]. The Journal of Supercomputing, 2023, 79 : 461 - 498
  • [5] Resource-aware Stream Processing in High Performance Cloud Environment
    Cheng, Yingchao
    Hao, Zhifeng
    Cai, Ruichu
    Wen, Wen
    Wang, Lijuan
    Zhou, Zhongrun
    [J]. 2018 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2018, : 381 - 388
  • [6] RESCUE: An energy-aware scheduler for cloud environments
    Zhang, Quan
    Metri, Grace
    Raghavan, Sudharsan
    Shi, Weisong
    [J]. SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS, 2014, 4 (04): : 215 - 224
  • [7] TOP-Storm: A topology-based resource-aware scheduler for Stream Processing Engine
    Asif Muhammad
    Muhammad Aleem
    Muhammad Arshad Islam
    [J]. Cluster Computing, 2021, 24 : 417 - 431
  • [8] TOP-Storm: A topology-based resource-aware scheduler for Stream Processing Engine
    Muhammad, Asif
    Aleem, Muhammad
    Islam, Muhammad Arshad
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2021, 24 (01): : 417 - 431
  • [9] ECSNeT++: A simulator for distributed stream processing on edge and cloud environments
    Amarasinghe, Gayashan
    de Assuncao, Marcos D.
    Harwood, Aaron
    Karunasekera, Shanika
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 111 : 401 - 418
  • [10] Elastic Stream Processing for Distributed Environments
    Hochreiner, Christoph
    Schulte, Stefan
    Dustdar, Schahram
    Lecue, Freddy
    [J]. IEEE INTERNET COMPUTING, 2015, 19 (06) : 54 - 59