Training Recommenders Over Large Item Corpus With Importance Sampling

Cited: 0
Authors
Lian, Defu [1 ]
Gao, Zhenguo [2 ]
Song, Xia [2 ]
Li, Yucheng [1 ]
Liu, Qi [1 ]
Chen, Enhong [1 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230052, Anhui, Peoples R China
[2] Shanghai Jiao Tong Univ, Sch Math Sci, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Personalized ranking; cluster-based sampling; implicit feedback; item recommendation;
DOI
10.1109/TKDE.2023.3344657
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
By predicting a personalized ranking on a set of items, item recommendation helps users determine the information they need. While optimizing a ranking-focused loss is more in line with the objectives of item recommendation, previous studies have indicated that current sampling-based ranking methods do not always surpass non-sampling ones. This is because it is either inefficient to sample a pool of representative negatives for better generalization or challenging to gauge their contributions to ranking-focused losses accurately. To this end, we propose a novel weighted ranking loss, which weights each negative with the softmax probability based on the model's predictive score. Our theoretical analysis suggests that optimizing this loss boosts the normalized discounted cumulative gain. Furthermore, this loss appears to act as an approximate analytic solution for adversarial training of personalized ranking. To improve optimization efficiency, we approximate the weighted ranking loss with self-normalized importance sampling and show that the loss has good generalization properties. To improve generalization, we further develop efficient cluster-based negative samplers based on clustering over item vectors, to decrease the approximation error caused by the divergence between the proposal and the target distribution. Comprehensive evaluations on real-world datasets show that our methods remarkably outperform leading item recommendation algorithms.
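The core idea in the abstract — weight each sampled negative by a softmax of the model's score, estimated via self-normalized importance sampling (SNIS) — can be sketched as follows. This is a minimal illustrative reconstruction, not the paper's implementation: the function name `snis_weighted_loss`, the dot-product scoring model, the softplus pairwise term, and the uniform proposal distribution are all assumptions for the sketch; the paper replaces the uniform proposal with a cluster-based sampler over item vectors to reduce the proposal–target divergence.

```python
import numpy as np

rng = np.random.default_rng(0)

def snis_weighted_loss(user_vec, pos_item_vec, item_matrix, proposal_probs,
                       n_samples=64, rng=rng):
    """Softmax-weighted ranking loss over sampled negatives, with the
    softmax weights estimated by self-normalized importance sampling.
    (Hypothetical sketch of the idea described in the abstract.)"""
    # Draw negative items j ~ q, the proposal distribution (here: any
    # categorical distribution over the item corpus).
    idx = rng.choice(item_matrix.shape[0], size=n_samples, p=proposal_probs)
    neg_scores = item_matrix[idx] @ user_vec       # model scores s(u, j)
    # SNIS weights: w_j proportional to exp(s_j) / q(j), normalized over
    # the drawn sample so they sum to one (self-normalization).
    log_w = neg_scores - np.log(proposal_probs[idx])
    log_w -= log_w.max()                           # stabilize the exponent
    w = np.exp(log_w)
    w /= w.sum()
    # Weighted pairwise loss: softplus of (negative score - positive score),
    # so hard negatives (high-scoring under the model) dominate the loss.
    margins = neg_scores - pos_item_vec @ user_vec
    return float(np.sum(w * np.logaddexp(0.0, margins)))
```

With a uniform proposal the `log q` term is constant and the weights reduce to a plain softmax over the sampled negatives' scores; a proposal closer to that softmax target (e.g. the paper's cluster-based sampler) lowers the variance of the estimate.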
Pages: 9433-9447
Page count: 15
Related Papers
50 records in total
  • [41] LARGE SAMPLE BEHAVIOR OF THE CTE AND VAR ESTIMATORS UNDER IMPORTANCE SAMPLING
    Ahn, Jae
    Shyamalkumar, Nariankadu
    NORTH AMERICAN ACTUARIAL JOURNAL, 2011, 15 (03) : 393 - 416
  • [42] Large Deviations and Importance Sampling for Systems of Slow-Fast Motion
    Spiliopoulos, Konstantinos
    APPLIED MATHEMATICS AND OPTIMIZATION, 2013, 67 (01): : 123 - 161
  • [43] Importance sampling large deviations in nonequilibrium steady states. I
    Ray, Ushnish
    Chan, Garnet Kin-Lic
    Limmer, David T.
    JOURNAL OF CHEMICAL PHYSICS, 2018, 148 (12):
  • [44] Large Deviations and Importance Sampling for Systems of Slow-Fast Motion
    Konstantinos Spiliopoulos
    Applied Mathematics & Optimization, 2013, 67 : 123 - 161
  • [45] EFFICIENT IMPORTANCE SAMPLING TECHNIQUES FOR LARGE DIMENSIONAL AND MULTIMODAL POSTERIOR COMPUTATIONS
    Das, Samarjit
    Vaswani, Namrata
    2009 IEEE 13TH DIGITAL SIGNAL PROCESSING WORKSHOP & 5TH IEEE PROCESSING EDUCATION WORKSHOP, VOLS 1 AND 2, PROCEEDINGS, 2009, : 274 - 279
  • [46] Efficient training of physics-informed neural networks via importance sampling
    Nabian, Mohammad Amin
    Gladstone, Rini Jasmine
    Meidani, Hadi
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2021, 36 (08) : 962 - 977
  • [47] A Corpus-Based Sampling to Build Training Data Set for Extracting Japanese Sentence Pattern
    Liu, Jun
    Ning, Yihaoran
    Fang, Yuanyu
    Zhuang, Luxuan
    Yu, Zhuohan
    Wu, Tingkun
    2022 11TH INTERNATIONAL CONFERENCE ON EDUCATIONAL AND INFORMATION TECHNOLOGY (ICEIT 2022), 2022, : 123 - 128
  • [48] DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus
    Bruemmer, Martin
    Dojchinovski, Milan
    Hellmann, Sebastian
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 3339 - 3343
  • [49] A fast importance sampling algorithm for unsupervised learning of over-complete dictionaries
    Blumensath, T
    Davies, M
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 213 - 216
  • [50] IMPORTANCE SAMPLING FOR TCM SCHEME OVER NON-GAUSSIAN NOISE CHANNEL
    SAKAI, T
    OGIWARA, H
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 1995, E78A (09) : 1109 - 1116