RecSysOps: Best Practices for Operating a Large-Scale Recommender System

Cited by: 2
Authors
Saberian, Mohammad [1]
Basilico, Justin [1]
Affiliations
[1] Netflix Inc, Los Gatos, CA 95032 USA
Keywords
Recommender Systems; RecSysOps; error detection; error prediction; model diagnostic; model explainability
DOI
10.1145/3460231.3474620
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Ensuring the health of a modern large-scale recommendation system is a very challenging problem. To address it, we need to put in place proper logging and sophisticated exploration policies, develop ML-interpretability tools, or even train new ML models to predict or detect issues in the main production model. In this talk, we shine a light on this less-discussed but important area and share some of the best practices, called RecSysOps, that we have learned while operating our increasingly complex recommender systems at Netflix. RecSysOps is a set of best practices for identifying issues and gaps, as well as diagnosing and resolving them, in a large-scale machine-learned recommender system. RecSysOps helped us to 1) reduce production issues, 2) increase recommendation quality by identifying areas of improvement, and 3) bring new innovations to our members faster by enabling us to spend more of our time on new innovations and less on debugging and firefighting.
Pages: 590-591
Number of pages: 2