LARA: Latency-Aware Resource Allocator for Stream Processing Applications

Cited: 0
Authors
Benedetti, Priscilla [1 ]
Coviello, Giuseppe [1 ]
Rao, Kunal [1 ]
Chakradhar, Srimat [1 ]
Affiliations
[1] NEC Labs Amer Inc, Princeton, NJ 08540 USA
Keywords
Stream processing; microservices; latency minimization; autoscaling; Kubernetes;
DOI
10.1109/PDP62718.2024.00018
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
One of the key metrics of interest for stream processing applications is "latency", which indicates the total time the application takes to process streaming input data and generate insights from it. For mission-critical video analytics applications such as surveillance and monitoring, it is of paramount importance to report an incident as soon as it occurs so that necessary actions can be taken right away. Stream processing applications are typically developed as a chain of microservices and are deployed on container orchestration platforms like Kubernetes. The allocation of system resources like "cpu" and "memory" to individual application microservices has a direct impact on "latency". Kubernetes does provide ways to allocate these resources, e.g., through fixed resource allocation or through the vertical pod autoscaler (VPA); however, there is no straightforward way in Kubernetes to prioritize "latency" for an end-to-end application pipeline. In this paper, we present LARA, which is specifically designed to improve the "latency" of stream processing application pipelines. LARA uses a regression-based technique for resource allocation to individual microservices. We implement four real-world video analytics application pipelines, i.e., license plate recognition, face recognition, human attributes detection and pose detection, and show that compared to fixed allocation, LARA is able to reduce latency by up to ~2.8X and is consistently better than VPA. While reducing latency, LARA also delivers over 2X the throughput of fixed allocation and is almost always better than VPA.
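The abstract does not disclose LARA's actual regression formulation, so the following is only a minimal sketch of what a regression-based, latency-aware allocator could look like: each microservice's latency is profiled at a few CPU settings, a simple inverse-CPU model is fit by least squares, and a CPU budget is then handed out greedily to whichever stage shows the largest predicted latency reduction. All stage names, profiling numbers, and the model form below are hypothetical, not taken from the paper.

```python
# Illustrative sketch only: a regression-based per-microservice CPU allocator
# in the spirit of LARA. Stage names, profiling samples, and the 1/cpu latency
# model are hypothetical assumptions, not the paper's actual technique.
import numpy as np

# Hypothetical profiling data: (cpu in millicores, measured stage latency in ms).
PROFILES = {
    "decoder":  [(250, 410.0), (500, 220.0), (1000, 150.0), (2000, 140.0)],
    "detector": [(500, 900.0), (1000, 480.0), (2000, 260.0), (4000, 210.0)],
}

def fit_latency_model(points):
    """Least-squares fit of latency ~= a + b / cpu (diminishing returns)."""
    cpu = np.array([c for c, _ in points], dtype=float)
    lat = np.array([l for _, l in points], dtype=float)
    design = np.column_stack([np.ones_like(cpu), 1.0 / cpu])
    coef, *_ = np.linalg.lstsq(design, lat, rcond=None)
    return coef  # (a, b)

def predicted_latency(coef, cpu):
    a, b = coef
    return a + b / cpu

def allocate(models, budget_mc, step_mc=250, floor_mc=250):
    """Greedily grant CPU in step_mc chunks to the stage whose predicted
    latency drops the most, until the budget is spent."""
    alloc = {name: floor_mc for name in models}
    spent = floor_mc * len(models)
    while spent + step_mc <= budget_mc:
        gains = {
            name: predicted_latency(m, alloc[name])
                  - predicted_latency(m, alloc[name] + step_mc)
            for name, m in models.items()
        }
        best = max(gains, key=gains.get)
        alloc[best] += step_mc
        spent += step_mc
    return alloc

models = {name: fit_latency_model(pts) for name, pts in PROFILES.items()}
print(allocate(models, budget_mc=4000))
# The slower stage (detector) ends up with most of the CPU budget.
```

In a Kubernetes deployment, the resulting millicore figures would map onto each container's CPU requests and limits, which is the same knob that fixed allocation and VPA tune.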
Pages: 68 - 77
Number of pages: 10
Related Papers
50 records in total
  • [31] Building latency-aware overlay topologies with QuickPeer
    Ceccanti, A
    Jesi, GP
2005 JOINT INTERNATIONAL CONFERENCE ON AUTONOMIC AND AUTONOMOUS SYSTEMS AND INTERNATIONAL CONFERENCE ON NETWORKING AND SERVICES (ICAS/ICNS), 2005: 148 - 153
  • [32] Latency-Aware Optimisation Framework for Cloudlet Placement
    Wong, Elaine
    Mondal, Sourav
    Das, Goutam
2017 19TH INTERNATIONAL CONFERENCE ON TRANSPARENT OPTICAL NETWORKS (ICTON), 2017
  • [33] Latency-aware reinforced routing for opportunistic networks
    Sharma, Deepak Kumar
    Gupta, Sarthak
    Malik, Shubham
    Kumar, Rohit
    IET COMMUNICATIONS, 2020, 14 (17) : 2981 - 2989
  • [34] Toward Latency-Aware Dynamic Middlebox Scheduling
    Duan, Pengfei
    Li, Qing
    Jiang, Yong
    Xia, Shu-Tao
24TH INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS ICCCN 2015, 2015
  • [35] Latency-aware publish/subscribe systems on MANET
    Lahyani, Imene
    Jmaiel, Mohamed
    Drira, Khalil
    Chassot, Christophe
INTERNATIONAL JOURNAL OF WIRELESS AND MOBILE COMPUTING, 2015, 8 (03) : 236 - 248
  • [36] Topology-aware task allocation for online distributed stream processing applications with latency constraints
    Wei, Xiaohui
    Wei, Xun
    Li, Hongliang
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2019, 534
  • [37] Latency-Aware Horizontal Computation Offloading for Parallel Processing in Fog-Enabled IoT
    Deb, Pallav Kumar
    Misra, Sudip
    Mukherjee, Anandarup
IEEE SYSTEMS JOURNAL, 2022, 16 (02) : 2537 - 2544
  • [38] Efficient cooperative cache management for latency-aware data intelligent processing in edge environment
    Li, Chunlin
    Liu, Jun
    Zhang, Qingchuan
    Luo, Youlong
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 123 : 48 - 67
  • [39] Latency-Aware Resource Allocation for Mobile Edge Generation and Computing via Deep Reinforcement Learning
    Wu, Yinyu
    Zhang, Xuhui
    Ren, Jinke
    Xing, Huijun
    Shen, Yanyan
    Cui, Shuguang
IEEE NETWORKING LETTERS, 2024, 6 (04) : 237 - 241
  • [40] Latency-Aware Resource Optimization for Next-Generation Wireless-Wireline Convergent Networks
    Nizam, Fareha
    Chuah, Teong Chee
    Lee, Ying Loong
    IEEE ACCESS, 2024, 12 : 120661 - 120671