RecSysOps: Best Practices for Operating a Large-Scale Recommender System

Cited by: 2
Authors
Saberian, Mohammad [1]
Basilico, Justin [1]
Affiliations
[1] Netflix Inc, Los Gatos, CA 95032 USA
Keywords
Recommender Systems; RecSysOps; error detection; error prediction; model diagnostic; model explainability
DOI
10.1145/3460231.3474620
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Ensuring the health of a modern large-scale recommendation system is a very challenging problem. To address this, we need to put in place proper logging and sophisticated exploration policies, develop ML-interpretability tools, or even train new ML models to predict or detect issues with the main production model. In this talk, we shine a light on this less-discussed but important area and share some of the best practices, called RecSysOps, that we have learned while operating our increasingly complex recommender systems at Netflix. RecSysOps is a set of best practices for identifying issues and gaps, as well as diagnosing and resolving them, in a large-scale machine-learned recommender system. RecSysOps has helped us to 1) reduce production issues, 2) increase recommendation quality by identifying areas of improvement, and 3) bring new innovations to our members faster by enabling us to spend more of our time on new innovations and less on debugging and firefighting.
Pages: 590-591
Page count: 2