Noah: Reinforcement-Learning-Based Rate Limiter for Microservices in Large-Scale E-Commerce Services

Cited by: 2

Authors:

Li, Zhao ^{[1]}

Sun, Haifeng ^{[2]}

Xiong, Zheng ^{[1]}

Huang, Qun ^{[2,3]}

Hu, Zehong ^{[1]}

Li, Ding ^{[1]}

Ruan, Shasha ^{[1]}

Hong, Hai ^{[1]}

Gui, Jie ^{[2]}

He, Jintao ^{[2]}

Xu, Zebin ^{[1]}

Fang, Yang ^{[1]}

Affiliations:

[1] Alibaba Grp, Hangzhou 311121, Peoples R China

[2] Peking Univ, Dept Comp & Sci, Beijing 100871, Peoples R China

[3] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 34 / No. 09

Funding:

National Natural Science Foundation of China;

Keywords:

Containers; Microservice architectures; Production; Electronic commerce; Monitoring; Measurement; Training; Deep reinforcement learning (DRL); deployment experience; e-commerce; microservice; rate limit;

DOI:

10.1109/TNNLS.2023.3264038

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Modern large-scale online service providers typically deploy microservices in containers to achieve flexible service management. One critical problem in such container-based microservice architectures is controlling the arrival rate of requests so that containers are not overloaded. In this article, we present our experience with rate limiting for the containers in one of the largest e-commerce services in the world. Given the highly diverse characteristics of the containers in this service, we point out that existing rate limit mechanisms cannot meet our demands. Thus, we design Noah, a dynamic rate limiter that automatically adapts to the specific characteristics of each container without human effort. The key idea of Noah is to use deep reinforcement learning (DRL) to automatically infer the most suitable configuration for each container. To fully embrace the advantages of DRL in our context, Noah addresses two technical challenges. First, Noah uses a lightweight system monitoring mechanism to collect container status. In this way, it minimizes the monitoring overhead while ensuring a timely reaction to system load changes. Second, Noah injects synthetic extreme data when training its models. Thus, its model gains knowledge of unseen special events and hence remains highly available in extreme scenarios. To guarantee model convergence with the injected training data, Noah adopts task-specific curriculum learning to train the model gradually, from normal data to extreme data. Noah has been deployed in production for two years, serving more than 50,000 containers and around 300 types of microservice applications. Experimental results show that Noah adapts well to three common scenarios in the production environment. It achieves better system availability and shorter request response time than four state-of-the-art rate limiters.

Pages: 5403 - 5417

Page count: 15
