Training Recommenders Over Large Item Corpus With Importance Sampling

Cited by: 0
Authors
Lian, Defu [1]
Gao, Zhenguo [2]
Song, Xia [2]
Li, Yucheng [1]
Liu, Qi [1]
Chen, Enhong [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230052, Anhui, Peoples R China
[2] Shanghai Jiao Tong Univ, Sch Math Sci, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Personalized ranking; cluster-based sampling; implicit feedback; item recommendation;
DOI
10.1109/TKDE.2023.3344657
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
By predicting a personalized ranking over a set of items, item recommendation helps users locate the information they need. Although optimizing a ranking-oriented loss is better aligned with the goal of item recommendation, previous studies have shown that existing sampling-based ranking methods do not always outperform non-sampling ones. The reason is that it is either inefficient to sample a pool of representative negatives for better generalization, or difficult to accurately gauge their contributions to ranking-oriented losses. To this end, we propose a novel weighted ranking loss, which weights each negative item with a softmax probability derived from the model's predictive score. Our theoretical analysis suggests that optimizing this loss improves the Normalized Discounted Cumulative Gain (NDCG). Furthermore, this loss appears to act as an approximate analytic solution to adversarial training of personalized ranking. To improve optimization efficiency, we approximate the weighted ranking loss with self-normalized importance sampling and show that the resulting loss has good generalization properties. To further improve generalization, we develop efficient cluster-based negative samplers that cluster the item vectors, reducing the approximation error caused by the divergence between the proposal and the target distribution. Comprehensive evaluations on real-world datasets show that our methods substantially outperform leading item recommendation algorithms.
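The core computation described in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes dot-product scoring, a uniform proposal distribution over items, and hypothetical names (sampled_softmax_ranking_loss, num_negatives). In the paper's method, the uniform proposal would be replaced by a cluster-based sampler built over the item vectors to reduce the proposal-target divergence.

```python
# Minimal sketch (assumptions noted above) of a softmax-weighted ranking loss
# estimated with self-normalized importance sampling.
import torch

def sampled_softmax_ranking_loss(user_emb, pos_item_emb, item_table, num_negatives=64):
    """user_emb: (B, d), pos_item_emb: (B, d), item_table: (N, d) item embeddings."""
    B, d = user_emb.shape
    N = item_table.shape[0]

    # Draw negatives from a simple uniform proposal q(j) = 1/N (illustrative;
    # a cluster-based proposal would be used in practice).
    neg_ids = torch.randint(0, N, (B, num_negatives))
    neg_emb = item_table[neg_ids]                               # (B, K, d)

    pos_score = (user_emb * pos_item_emb).sum(-1)               # (B,)
    neg_score = torch.einsum('bd,bkd->bk', user_emb, neg_emb)   # (B, K)

    # Self-normalized importance weights: exp(score) / q(j), normalized over the
    # drawn sample. With a uniform proposal the constant 1/q cancels after the
    # softmax normalization.
    log_q = -torch.log(torch.tensor(float(N)))
    log_w = neg_score - log_q
    w = torch.softmax(log_w, dim=-1)                            # (B, K), sums to 1

    # Each negative's margin to the positive is weighted by its estimated
    # softmax probability, so harder negatives contribute more to the gradient.
    margin = neg_score - pos_score.unsqueeze(-1)                # (B, K)
    loss = (w * torch.nn.functional.softplus(margin)).sum(-1).mean()
    return loss
```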
Pages: 9433-9447
Page count: 15
Related Papers
50 items in total
  • [1] Sampling-Bias-Corrected Neural Modeling for Large Corpus Item Recommendations
    Yi, Xinyang
    Yang, Ji
    Hong, Lichan
    Cheng, Derek Zhiyuan
    Heldt, Lukasz
    Kumthekar, Aditee
    Zhao, Zhe
    Wei, Li
    Chi, Ed
    RECSYS 2019: 13TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, 2019, : 269 - 277
  • [2] RecJPQ: Training Large-Catalogue Sequential Recommenders
    Petrov, Aleksandr V.
    Macdonald, Craig
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 538 - 547
  • [3] Cross-Batch Negative Sampling for Training Two-Tower Recommenders
    Wang, Jinpeng
    Zhu, Jieming
    He, Xiuqiang
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1632 - 1636
  • [4] Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
    Zou, Difan
    Hu, Ziniu
    Wang, Yewen
    Jiang, Song
    Sun, Yizhou
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [5] Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
    Xiao, Shitao
    Liu, Zheng
    Shao, Yingxia
    Di, Tao
    Middha, Bhuvan
    Wu, Fangzhao
    Xie, Xing
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 4215 - 4225
  • [6] Importance sampling in neural detector training phase
    Andina, D
    Martínez-Antorrena, J
    Melgar, I
    Soft Computing with Industrial Applications, Vol 17, 2004, 17 : 43 - 48
  • [7] Enhancing Siamese Networks Training with Importance Sampling
    Shrestha, Ajay
    Mahmood, Ausif
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 610 - 615
  • [8] COUNTEREXAMPLES IN IMPORTANCE SAMPLING FOR LARGE DEVIATIONS PROBABILITIES
    Glasserman, Paul
    Wang, Yashan
    ANNALS OF APPLIED PROBABILITY, 1997, 7 (03): 731 - 746
  • [9] The Norwegian Colossal Corpus: A Text Corpus for Training Large Norwegian Language Models
    Kummervold, Per E.
    Wetjen, Freddy
    de la Rosa, Javier
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3852 - 3860
  • [10] Large Language Models are Competitive Near Cold-start Recommenders for Language- and Item-based Preferences
    Sanner, Scott
    Balog, Krisztian
    Radlinski, Filip
    Wedin, Ben
    Dixon, Lucas
    PROCEEDINGS OF THE 17TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2023, 2023, : 890 - 896