Training Recommenders Over Large Item Corpus With Importance Sampling

Cited by: 0
Authors
Lian, Defu [1]
Gao, Zhenguo [2]
Song, Xia [2]
Li, Yucheng [1]
Liu, Qi [1]
Chen, Enhong [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230052, Anhui, Peoples R China
[2] Shanghai Jiao Tong Univ, Sch Math Sci, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Personalized ranking; cluster-based sampling; implicit feedback; item recommendation;
DOI
10.1109/TKDE.2023.3344657
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
By predicting a personalized ranking over a set of items, item recommendation helps users locate the information they need. Although optimizing a ranking-oriented loss is better aligned with the goal of item recommendation, previous studies have shown that existing sampling-based ranking methods do not always outperform non-sampling ones. The reason is that it is either inefficient to sample a pool of representative negatives for better generalization, or difficult to accurately gauge their contributions to ranking-oriented losses. To this end, we propose a novel weighted ranking loss, which weights each negative item with a softmax probability derived from the model's predictive score. Our theoretical analysis suggests that optimizing this loss improves the Normalized Discounted Cumulative Gain (NDCG). Furthermore, this loss appears to act as an approximate analytic solution to adversarial training of personalized ranking. To improve optimization efficiency, we approximate the weighted ranking loss with self-normalized importance sampling and show that the resulting loss has good generalization properties. To further improve generalization, we develop efficient cluster-based negative samplers that cluster the item vectors, reducing the approximation error caused by the divergence between the proposal and the target distribution. Comprehensive evaluations on real-world datasets show that our methods substantially outperform leading item recommendation algorithms.
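The core computation described in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes dot-product scoring, a uniform proposal distribution over items, and hypothetical names (sampled_softmax_ranking_loss, num_negatives). In the paper's method, the uniform proposal would be replaced by a cluster-based sampler built over the item vectors to reduce the proposal-target divergence.

```python
# Minimal sketch (assumptions noted above) of a softmax-weighted ranking loss
# estimated with self-normalized importance sampling.
import torch

def sampled_softmax_ranking_loss(user_emb, pos_item_emb, item_table, num_negatives=64):
    """user_emb: (B, d), pos_item_emb: (B, d), item_table: (N, d) item embeddings."""
    B, d = user_emb.shape
    N = item_table.shape[0]

    # Draw negatives from a simple uniform proposal q(j) = 1/N (illustrative;
    # a cluster-based proposal would be used in practice).
    neg_ids = torch.randint(0, N, (B, num_negatives))
    neg_emb = item_table[neg_ids]                               # (B, K, d)

    pos_score = (user_emb * pos_item_emb).sum(-1)               # (B,)
    neg_score = torch.einsum('bd,bkd->bk', user_emb, neg_emb)   # (B, K)

    # Self-normalized importance weights: exp(score) / q(j), normalized over the
    # drawn sample. With a uniform proposal the constant 1/q cancels after the
    # softmax normalization.
    log_q = -torch.log(torch.tensor(float(N)))
    log_w = neg_score - log_q
    w = torch.softmax(log_w, dim=-1)                            # (B, K), sums to 1

    # Each negative's margin to the positive is weighted by its estimated
    # softmax probability, so harder negatives contribute more to the gradient.
    margin = neg_score - pos_score.unsqueeze(-1)                # (B, K)
    loss = (w * torch.nn.functional.softplus(margin)).sum(-1).mean()
    return loss
```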
Pages: 9433-9447
Page count: 15
Related Papers
50 items in total
  • [1] Sampling-Bias-Corrected Neural Modeling for Large Corpus Item Recommendations
    Yi, Xinyang
    Yang, Ji
    Hong, Lichan
    Cheng, Derek Zhiyuan
    Heldt, Lukasz
    Kumthekar, Aditee
    Zhao, Zhe
    Wei, Li
    Chi, Ed
    RECSYS 2019: 13TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, 2019, : 269 - 277
  • [2] RecJPQ: Training Large-Catalogue Sequential Recommenders
    Petrov, Aleksandr V.
    Macdonald, Craig
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 538 - 547
  • [3] Cross-Batch Negative Sampling for Training Two-Tower Recommenders
    Wang, Jinpeng
    Zhu, Jieming
    He, Xiuqiang
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1632 - 1636
  • [4] Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
    Zou, Difan
    Hu, Ziniu
    Wang, Yewen
    Jiang, Song
    Sun, Yizhou
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [5] Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
    Xiao, Shitao
    Liu, Zheng
    Shao, Yingxia
    Di, Tao
    Middha, Bhuvan
    Wu, Fangzhao
    Xie, Xing
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 4215 - 4225
  • [6] Importance sampling in neural detector training phase
    Andina, D
    Martínez-Antorrena, J
    Melgar, I
    Soft Computing with Industrial Applications, Vol 17, 2004, 17 : 43 - 48
  • [7] Enhancing Siamese Networks Training with Importance Sampling
    Shrestha, Ajay
    Mahmood, Ausif
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 610 - 615
  • [8] COUNTEREXAMPLES IN IMPORTANCE SAMPLING FOR LARGE DEVIATIONS PROBABILITIES
    Glasserman, Paul
    Wang, Yashan
    ANNALS OF APPLIED PROBABILITY, 1997, 7 (03): 731 - 746
  • [9] The Norwegian Colossal Corpus: A Text Corpus for Training Large Norwegian Language Models
    Kummervold, Per E.
    Wetjen, Freddy
    de la Rosa, Javier
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3852 - 3860
  • [10] Large Language Models are Competitive Near Cold-start Recommenders for Language- and Item-based Preferences
    Sanner, Scott
    Balog, Krisztian
    Radlinski, Filip
    Wedin, Ben
    Dixon, Lucas
    PROCEEDINGS OF THE 17TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2023, 2023, : 890 - 896