LARA: Latency-Aware Resource Allocator for Stream Processing Applications

Cited by: 0
Authors
Benedetti, Priscilla [1 ]
Coviello, Giuseppe [1 ]
Rao, Kunal [1 ]
Chakradhar, Srimat [1 ]
Affiliation
[1] NEC Labs Amer Inc, Princeton, NJ, USA
Keywords
Stream processing; microservices; latency minimization; autoscaling; Kubernetes
DOI
10.1109/PDP62718.2024.00018
CLC number
TP3 [Computing technology; computer technology]
Subject classification code
0812
Abstract
One of the key metrics of interest for stream processing applications is "latency", which indicates the total time it takes for the application to process streaming input data and generate insights from it. For mission-critical video analytics applications like surveillance and monitoring, it is of paramount importance to report an incident as soon as it occurs so that necessary actions can be taken right away. Stream processing applications are typically developed as a chain of microservices and are deployed on container orchestration platforms like Kubernetes. The allocation of system resources like "cpu" and "memory" to individual application microservices has a direct impact on "latency". Kubernetes does provide ways to allocate these resources, e.g., through fixed resource allocation or through the vertical pod autoscaler (VPA); however, there is no straightforward way in Kubernetes to prioritize "latency" for an end-to-end application pipeline. In this paper, we present LARA, which is specifically designed to improve the "latency" of stream processing application pipelines. LARA uses a regression-based technique for resource allocation to individual microservices. We implement four real-world video analytics application pipelines, i.e., license plate recognition, face recognition, human attributes detection and pose detection, and show that compared to fixed allocation, LARA reduces latency by up to ~2.8X and is consistently better than VPA. While reducing latency, LARA also delivers over 2X the throughput of fixed allocation and is almost always better than VPA.
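The abstract does not detail LARA's allocation algorithm, so the following is only a minimal, hypothetical Python sketch of how a regression-based approach could work: a latency model is fit per microservice from profiled (cpu, memory) configurations, and a small search picks the per-stage allocation that minimizes predicted end-to-end pipeline latency under an assumed resource budget. The microservice names, profiling numbers, linear model form, and budget below are illustrative assumptions, not LARA's actual implementation.

# Illustrative sketch only (hypothetical names and numbers): regression models map
# per-microservice (cpu millicores, memory MiB) to stage latency, then a grid search
# picks the allocation with the lowest predicted end-to-end latency within a budget.
import itertools
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumed profiling data per microservice: candidate resource configs and measured latencies (ms).
profiles = {
    "decoder":  ([[250, 256], [500, 512], [1000, 1024]], [120.0, 70.0, 45.0]),
    "detector": ([[500, 512], [1000, 1024], [2000, 2048]], [300.0, 160.0, 95.0]),
}

# One regression model per microservice: predicted stage latency as a function of resources.
models = {
    name: LinearRegression().fit(np.array(configs), np.array(latencies))
    for name, (configs, latencies) in profiles.items()
}

CPU_BUDGET_M, MEM_BUDGET_MI = 2500, 2560  # assumed total budget for the whole pipeline

best_latency, best_alloc = float("inf"), None
for combo in itertools.product(*(configs for configs, _ in profiles.values())):
    cpu = sum(c for c, _ in combo)
    mem = sum(m for _, m in combo)
    if cpu > CPU_BUDGET_M or mem > MEM_BUDGET_MI:
        continue  # skip allocations that exceed the assumed budget
    # End-to-end latency of a chained pipeline approximated as the sum of stage latencies.
    latency = sum(
        float(models[name].predict(np.array([cfg]))[0])
        for name, cfg in zip(profiles.keys(), combo)
    )
    if latency < best_latency:
        best_latency, best_alloc = latency, dict(zip(profiles.keys(), combo))

print(f"predicted end-to-end latency: {best_latency:.1f} ms")
print(f"chosen (cpu_m, mem_Mi) per microservice: {best_alloc}")

In an actual Kubernetes deployment, the chosen values would presumably be applied as the resource requests/limits of each microservice's pod; that step is omitted from the sketch.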
Pages: 68 - 77
Page count: 10
Related papers
50 records in total
  • [1] Latency-Aware Placement of Stream Processing Operators
    Ecker, Raphael
    Karagiannis, Vasileios
    Sober, Michael
    Ebrahimi, Elmira
    Schulte, Stefan
    EURO-PAR 2023: PARALLEL PROCESSING WORKSHOPS, PT I, EURO-PAR 2023, 2024, 14351 : 30 - 41
  • [2] Latency-aware decentralized resource management for IoT applications
    Avasalcai, Cosmin
    Dustdar, Schahram
    PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON THE INTERNET OF THINGS (IOT'18), 2018,
  • [3] Latency-Aware Secure Elastic Stream Processing with Homomorphic Encryption
    Arosha Rodrigo
    Miyuru Dayarathna
    Sanath Jayasena
    Data Science and Engineering, 2019, 4 : 223 - 239
  • [4] Latency-Aware Secure Elastic Stream Processing with Homomorphic Encryption
    Rodrigo, Arosha
    Dayarathna, Miyuru
    Jayasena, Sanath
    DATA SCIENCE AND ENGINEERING, 2019, 4 (03) : 223 - 239
  • [5] Latency-Aware Strategies for Deploying Data Stream Processing Applications on Large Cloud-Edge Infrastructure
    Veith, Alexandre da Silva
    de Assuncao, Marcos Dias
    Lefevre, Laurent
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2023, 11 (01) : 445 - 456
  • [6] Latency-Aware Resource Allocation in Green Fog Networks for Industrial IoT Applications
    Basir, Rabeea
    Qaisar, Saad B.
    Ali, Mudassar
    Naeem, Muhammad
    Joshi, Kishor Chandra
    Rodriguez, Jonathan
    2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 2020,
  • [7] Energy and Latency-aware Resource Reconfiguration in Fog Environments
    Godinho, Noe
    Silva, Henrique
    Curado, Marilia
    Paquete, Luis
    2020 IEEE 19TH INTERNATIONAL SYMPOSIUM ON NETWORK COMPUTING AND APPLICATIONS (NCA), 2020,
  • [8] Latency-Aware Placement of Data Stream Analytics on Edge Computing
    Veith, Alexandre da Silva
    de Assuncao, Marcos Dias
    Lefevre, Laurent
    SERVICE-ORIENTED COMPUTING (ICSOC 2018), 2018, 11236 : 215 - 229
  • [9] Latency-Aware Task Partitioning and Resource Allocation in Fog Networks
    Saxena, Mohit Kumar
    Kumar, Sudhir
    2022 IEEE 19TH INDIA COUNCIL INTERNATIONAL CONFERENCE, INDICON, 2022,
  • [10] Quokka: Latency-Aware Middlebox Scheduling with dynamic resource allocation
    Li, Qing
    Jiang, Yong
    Duan, Pengfei
    Xu, Mingwei
    Xiao, Xi
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2017, 78 : 253 - 266