Online Nonstop Task Management for Storm-Based Distributed Stream Processing Engines

Authors
Zhang, Zhou [1, 2]
Jin, Pei-Quan [1, 2]
Xie, Xi-Ke [1]
Wang, Xiao-Liang [1, 2]
Liu, Rui-Cheng [1, 2]
Wan, Shou-Hong [1, 2]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230026, Peoples R China
[2] Chinese Acad Sci, Key Lab Electromagnet Space Informat, Hefei 230026, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
distributed stream processing engine (DSPE); Apache Storm; online task migration; online task deployment; real-time
DOI
10.1007/s11390-021-1629-9
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Most distributed stream processing engines (DSPEs) do not support online task management and cannot adapt to time-varying data flows. Recently, some studies have proposed online task deployment algorithms to solve this problem. However, these approaches do not guarantee Quality of Service (QoS) when the task deployment changes at runtime, because the task migrations triggered by a deployment change impose an exorbitant cost. We study one of the most popular DSPEs, Apache Storm, and find that when a task needs to be migrated, Storm has to stop the resource (implemented as a Worker process in Storm) where the task is deployed. This stops and restarts every task in that resource, which makes task migrations perform poorly. To solve this problem, we propose N-Storm (Nonstop Storm), a task-resource decoupling DSPE. N-Storm allows the tasks allocated to a resource to be changed at runtime, which is implemented by a thread-level scheme for task migrations. In particular, we add a local shared key/value store on each node to make resources aware of changes in the allocation plan, so that each resource can manage its tasks at runtime. Based on N-Storm, we further propose Online Task Deployment (OTD). Unlike traditional task deployment algorithms, which deploy all tasks at once without considering the cost of the task migrations caused by a re-deployment, OTD gradually adjusts the current task deployment toward an optimized one based on the communication cost and the runtime states of resources. We demonstrate that OTD can adapt to different kinds of applications, including computation-intensive and communication-intensive applications. The experimental results on a real DSPE cluster show that N-Storm can avoid the system stop and save up to 87% of the performance degradation time, compared with Apache Storm and other state-of-the-art approaches. In addition, OTD can increase the average CPU usage by 51% for computation-intensive applications and reduce network communication costs by 88% for communication-intensive applications.
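The thread-level migration idea in the abstract can be illustrated with a minimal Python sketch. It is not the N-Storm implementation: `LocalStore`, `Worker`, `sync`, and `demo` are hypothetical names, the store is a plain in-memory dictionary standing in for the node-local shared key/value store, and task state transfer is not modeled. The sketch only shows that a long-lived worker can start and stop individual task threads when the allocation plan changes, without restarting its other tasks.

```python
import threading


class LocalStore:
    """Hypothetical stand-in for the node-local shared key/value store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._plan = {}  # task_id -> worker_id

    def assign(self, task_id, worker_id):
        with self._lock:
            self._plan[task_id] = worker_id

    def tasks_for(self, worker_id):
        with self._lock:
            return {t for t, w in self._plan.items() if w == worker_id}


class Worker:
    """A long-lived worker that runs each task in its own thread (assumption)."""

    def __init__(self, worker_id, store):
        self.worker_id = worker_id
        self.store = store
        self.running = {}  # task_id -> (thread, stop_event)

    def _run_task(self, stop_event):
        # Placeholder loop; a real task would process tuples here.
        while not stop_event.is_set():
            stop_event.wait(0.01)

    def _stop(self, task_id):
        thread, stop_event = self.running.pop(task_id)
        stop_event.set()
        thread.join()

    def sync(self):
        """Reconcile running task threads with the plan; the worker never restarts."""
        wanted = self.store.tasks_for(self.worker_id)
        for task_id in set(self.running) - wanted:
            self._stop(task_id)
        for task_id in wanted - set(self.running):
            stop_event = threading.Event()
            thread = threading.Thread(
                target=self._run_task, args=(stop_event,), daemon=True
            )
            thread.start()
            self.running[task_id] = (thread, stop_event)
        return sorted(self.running)

    def shutdown(self):
        for task_id in list(self.running):
            self._stop(task_id)


def demo():
    store = LocalStore()
    w1, w2 = Worker("w1", store), Worker("w2", store)
    store.assign("t1", "w1")
    store.assign("t2", "w1")
    store.assign("t3", "w2")
    before = (w1.sync(), w2.sync())
    t1_thread = w1.running["t1"][0]

    store.assign("t2", "w2")  # migrate only t2 from w1 to w2
    after = (w1.sync(), w2.sync())

    # t1 kept the same live thread, so it was not stopped by the migration.
    untouched = w1.running["t1"][0] is t1_thread and t1_thread.is_alive()
    w1.shutdown()
    w2.shutdown()
    return before, after, untouched
```

Calling `demo()` moves `t2` between workers while `t1` keeps running in its original thread. In this sketch each worker calls `sync()` explicitly; a real system would more plausibly poll or watch the store for changes.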
Pages: 116-138
Page count: 23