SLA-Based Adaptation Schemes in Distributed Stream Processing Engines

Cited by: 3
Authors
Hanif, Muhammad [1 ]
Kim, Eunsam [2 ]
Helal, Sumi [3 ]
Lee, Choonhwa [1 ]
Affiliations
[1] Hanyang Univ, Div Comp Sci & Engn, Seoul 133791, South Korea
[2] Hongik Univ, Dept Comp Engn, Seoul 121791, South Korea
[3] Univ Lancaster, Sch Comp & Commun, Lancaster, England
Source
APPLIED SCIENCES-BASEL | 2019 / Vol. 9 / Issue 06
Funding
National Research Foundation of Singapore;
Keywords
big data; distributed computing; modern stream processing engine; SLA; watermarking; cloud computing; MODEL;
DOI
10.3390/app9061045
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
With the upswing in the volume of data, online information, and massive cloud applications, big data analytics has become mainstream in both industry and academic research communities. This has prompted the emergence and development of real-time distributed stream processing frameworks, such as Flink, Storm, Spark, and Samza. These frameworks allow complex queries on streaming data to be distributed across multiple worker nodes in a cluster. A few of these stream processing frameworks provide fundamental support for controlling the latency and throughput of the system as well as the correctness of the results. However, none can adjust them on the fly at runtime. We present a well-informed and efficient adaptive watermarking and dynamic buffering timeout mechanism for distributed streaming frameworks. It is designed to increase the overall throughput of the system by adapting the watermarks to the incoming workload stream and scaling the buffering timeout dynamically for each task tracker on the fly, while maintaining the Service Level Agreement (SLA)-based end-to-end latency of the system. This work focuses on tuning the parameters of the system (such as window correctness, buffering timeout, and so on) based on the prediction of incoming workloads, and assesses whether a given workload will breach an SLA using output metrics including latency, throughput, and correctness of both intermediate and final results. We used Apache Flink as our testbed distributed processing engine for this work. However, the proposed mechanism can be applied to other streaming frameworks as well. Our results on the testbed model indicate that the proposed system outperforms the status quo of stream processing. With the inclusion of learning models such as naive Bayes, multilayer perceptron (MLP), and sequential minimal optimization (SMO), the system shows further improvement in keeping the SLA intact and in maintaining quality of service (QoS).
Pages: 21
Related papers
50 in total
  • [1] An Adaptive SLA-Based Data Flow Mechanism for Stream Processing Engines
    Hanif, Muhammad
    Yoon, Hyungduk
    Jang, Sunglim
    Lee, Choonhwa
    2017 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC), 2017, : 81 - 86
  • [2] Optimal Quality Adaptation for SLA-Based VoIP
    Su, Hui-Kai
    Wu, Chien-Min
    Yang, Ming-Ta
    2009 INTERNATIONAL CONFERENCE ON NEW TRENDS IN INFORMATION AND SERVICE SCIENCE (NISS 2009), VOLS 1 AND 2, 2009, : 734 - +
  • [3] Duality-Based Locality-Aware Stream Partitioning in Distributed Stream Processing Engines
    Son, Siwoon
    Moon, Yang-Sae
    EURO-PAR 2019: PARALLEL PROCESSING WORKSHOPS, 2020, 11997 : 725 - 730
  • [4] SLA-Based Management of Software Licenses as Web Service Resources in Distributed Environments
    Cacciari, Claudio
    Mallmann, Daniel
    Zsigri, Csilla
    D'Andria, Francesco
    Hagemeier, Bjoern
    Rumpl, Angela
    Ziegler, Wolfgang
    Martrat, Josep
    ECONOMICS OF GRIDS, CLOUDS, SYSTEMS, AND SERVICES, 2010, 6296 : 78 - +
  • [5] SLA-Based Reputation Life Cycle
    Hamadache, Kahina
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2013 CONFERENCES, 2013, 8185 : 377 - 394
  • [6] Usage SLA-based scheduling in grids
    Dumitrescu, Catalin L.
    Raicu, Ioan
    Foster, Ian
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2007, 19 (07): : 945 - 963
  • [7] An SLA-based Broker for Cloud Infrastructures
    Antonio Cuomo
    Giuseppe Di Modica
    Salvatore Distefano
    Antonio Puliafito
    Massimiliano Rak
    Orazio Tomarchio
    Salvatore Venticinque
    Umberto Villano
    Journal of Grid Computing, 2013, 11 : 1 - 25
  • [8] How to Measure Scalability of Distributed Stream Processing Engines?
    Henning, Soeren
    Hasselbring, Wilhelm
    COMPANION OF THE ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING, ICPE 2021, 2021, : 85 - 88
  • [9] A Backpressure Mitigation Scheme in Distributed Stream Processing Engines
    Hanif, Muhammad
    Yoon, Hyeongdeok
    Lee, Choonhwa
    2020 34TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN 2020), 2020, : 713 - 716
  • [10] Benchmarking Tool for Modern Distributed Stream Processing Engines
    Hanif, Muhammad
    Yoon, Hyeongdeok
    Lee, Choonhwa
    33RD INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN 2019), 2019, : 393 - 395