AdaEmbed: Adaptive Embedding for Large-Scale Recommendation Models

Cited by: 0
Authors
Lai, Fan [2]
Zhang, Wei [1]
Liu, Rui [1]
Tsai, William [1]
Wei, Xiaohan [1]
Hu, Yuxi [1]
Devkota, Sabin [1]
Huang, Jianyu [1]
Park, Jongsoo [1]
Liu, Xing [1]
Chen, Zeliang [1]
Wen, Ellie [1]
Rivera, Paul [1]
You, Jie [1]
Chen, Chun-Cheng Jason [1]
Chowdhury, Mosharaf [2]
Affiliations
[1] Meta, Menlo Park, CA, USA
[2] University of Michigan, Ann Arbor, MI, USA
Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep learning recommendation models (DLRMs) are using increasingly larger embedding tables to represent categorical sparse features such as video genres. Each embedding row of the table represents the trainable weight vector for a specific instance of that feature. While increasing the number of embedding rows typically improves model accuracy by considering more feature instances, it can lead to larger deployment costs and slower model execution. Unlike existing efforts that primarily focus on optimizing DLRMs for the given embedding, we present a complementary system, AdaEmbed, to reduce the size of embeddings needed for the same DLRM accuracy via in-training embedding pruning. Our key insight is that the access patterns and weights of different embeddings are heterogeneous across embedding rows, and dynamically change over the training process, implying varying embedding importance with respect to model accuracy. However, identifying important embeddings and then enforcing pruning for modern DLRMs with up to billions of embeddings (terabytes) is challenging. Given the total embedding size, AdaEmbed considers embeddings with higher runtime access frequencies and larger training gradients to be more important, and it dynamically prunes less important embeddings at scale to automatically determine per-feature embeddings. Our evaluations in industrial settings show that AdaEmbed saves 35-60% embedding size needed in deployment and improves model execution speed by 11-34%, while achieving noticeable accuracy gains.
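The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says only that rows with higher access frequencies and larger gradients count as more important, so the scoring rule used here (access count multiplied by gradient norm), the `(feature, row)` key layout, and all function names are assumptions made for illustration. Given a global row budget, the sketch keeps the highest-scoring rows across all features, and the per-feature table sizes fall out of that selection.

```python
from typing import Dict, List, Tuple

RowKey = Tuple[str, int]  # (feature name, row id); hypothetical key layout


def importance(access_count: int, grad_norm: float) -> float:
    """Assumed score: access frequency times gradient magnitude."""
    return access_count * grad_norm


def select_rows(stats: Dict[RowKey, Tuple[int, float]], budget: int) -> List[RowKey]:
    """Keep the `budget` highest-scoring rows across all features."""
    # Sort by descending score; break ties by key so the result is deterministic.
    ranked = sorted(stats, key=lambda k: (-importance(*stats[k]), k))
    return sorted(ranked[:budget])


def rows_per_feature(kept: List[RowKey]) -> Dict[str, int]:
    """Count how many rows each feature retains after pruning."""
    counts: Dict[str, int] = {}
    for feature, _ in kept:
        counts[feature] = counts.get(feature, 0) + 1
    return counts


# Toy statistics: (access count, gradient norm) per embedding row.
stats = {
    ("genre", 0): (120, 0.50),
    ("genre", 1): (3, 0.10),
    ("genre", 2): (40, 0.80),
    ("user", 0): (5, 0.05),
    ("user", 1): (900, 0.02),
}
kept = select_rows(stats, budget=3)
print(kept)
print(rows_per_feature(kept))
```

In this toy run the budget of three rows is split unevenly between the two features, which mirrors the abstract's point that per-feature embedding sizes are determined automatically rather than fixed in advance. A real system would refresh the statistics during training and re-run the selection periodically, since the abstract notes that importance changes over time.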
Pages: 817-831
Page count: 15
Related papers
50 in total
  • [1] Local Factor Models for Large-Scale Inductive Recommendation
    Yang, Longqi
    Schnabel, Tobias
    Bennett, Paul N.
    Dumais, Susan
    [J]. 15TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS 2021), 2021, : 252 - 262
  • [2] EANA: Reducing Privacy Risk on Large-scale Recommendation Models
    Berlowitz, Devora
    Chen, Mei
    Chien, Steve
    Ning, Lin
    Song, Shuang
    Xue, Yunqi
    [J]. PROCEEDINGS OF THE 16TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2022, 2022, : 399 - 407
  • [3] Adaptive Word Embedding Module for Semantic Reasoning in Large-scale Detection
    Zhang, Yu
    Wu, Xiaoyu
    Zhu, Ruolin
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2103 - 2109
  • [4] OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model
    Wang, Zheng
    Wang, Yuke
    Feng, Boyuan
    Huang, Guyue
    Mudigere, Dheevatsa
    Muthiah, Bharath
    Li, Ang
    Ding, Yufei
    [J]. PROCEEDINGS OF THE 2024 USENIX ANNUAL TECHNICAL CONFERENCE, ATC 2024, 2024, : 667 - 682
  • [5] Beyond User Embedding Matrix: Learning to Hash for Modeling Large-Scale Users in Recommendation
    Shi, Shaoyun
    Ma, Weizhi
    Zhang, Min
    Zhang, Yongfeng
    Yu, Xinxing
    Shan, Houzhi
    Liu, Yiqun
    Ma, Shaoping
    [J]. PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 319 - 328
  • [6] Personalized POI Embedding for Successive POI Recommendation with Large-scale Smart Card Data
    Kim, Jin-Young
    Lim, Kyung-Hyun
    Cho, Sung-Bae
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 3583 - 3589
  • [7] Large-Scale Patch Recommendation at Alibaba
    Zhang, Xindong
    Zhu, Chenguang
    Li, Yi
    Guo, Jianmei
    Liu, Lihua
    Gu, Haobo
    [J]. 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2020), 2020, : 252 - 253
  • [8] Large-scale Recommendation for Portfolio Optimization
    Swezey, Robin M. E.
    Charron, Bruno
    [J]. 12TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS), 2018, : 382 - 386
  • [9] Large-Scale Heterogeneous Feature Embedding
    Huang, Xiao
    Song, Qingquan
    Yang, Fan
    Hu, Xia
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 3878 - 3885
  • [10] SSDC: A Scalable Sparse Differential Checkpoint for Large-scale Deep Recommendation Models
    Xiang, Lingrui
    Lu, Xiaofen
    Zhang, Rui
    Hu, Zheng
    [J]. 2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,