LARA: Latency-Aware Resource Allocator for Stream Processing Applications

Times cited: 0
Authors
Benedetti, Priscilla [1]
Coviello, Giuseppe [1]
Rao, Kunal [1]
Chakradhar, Srimat [1]
Affiliations
[1] NEC Labs Amer Inc, Princeton, NJ 95110 USA
Keywords
Stream processing; microservices; latency minimization; autoscaling; Kubernetes
DOI
10.1109/PDP62718.2024.00018
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
One of the key metrics of interest for stream processing applications is "latency", which indicates the total time it takes for the application to process streaming input data and generate insights from it. For mission-critical video analytics applications like surveillance and monitoring, it is of paramount importance to report an incident as soon as it occurs so that necessary actions can be taken right away. Stream processing applications are typically developed as a chain of microservices and are deployed on container orchestration platforms like Kubernetes. Allocation of system resources like "cpu" and "memory" to individual application microservices has a direct impact on "latency". Kubernetes does provide ways to allocate these resources, e.g., through fixed resource allocation or through the Vertical Pod Autoscaler (VPA); however, there is no straightforward way in Kubernetes to prioritize "latency" for an end-to-end application pipeline. In this paper, we present LARA, which is specifically designed to improve the "latency" of stream processing application pipelines. LARA uses a regression-based technique to allocate resources to individual microservices. We implement four real-world video analytics application pipelines, namely license plate recognition, face recognition, human attributes detection, and pose detection, and show that, compared to fixed allocation, LARA is able to reduce latency by up to ~2.8X and is consistently better than VPA. While reducing latency, LARA is also able to deliver over 2X throughput compared to fixed allocation and is almost always better than VPA.
Pages: 68-77
Number of pages: 10