Dynamic clustering based contextual combinatorial multi-armed bandit for online recommendation

Cited by: 1
Authors
Yan, Cairong [1]
Han, Haixia [1]
Zhang, Yanting [1]
Zhu, Dandan [1]
Wan, Yongquan [2,3]
Affiliations
[1] Donghua Univ, Shanghai, Peoples R China
[2] Shanghai Univ, Shanghai, Peoples R China
[3] Shanghai Jian Qiao Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Online recommendation; Dynamic clustering; Contextual multi-armed bandit; Implicit feedback;
DOI
10.1016/j.knosys.2022.109927
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Recommender systems still face a trade-off between exploring new items to maximize user satisfaction and exploiting items users have already interacted with to match their interests. This problem is widely recognized as the exploration/exploitation (EE) dilemma, and the multi-armed bandit (MAB) algorithm has proven to be an effective solution. As the scale of users and items in real-world application scenarios increases, purchase interactions become sparser, and three issues need to be investigated when building MAB-based recommender systems. First, large-scale users and sparse interactions increase the difficulty of mining user preferences. Second, traditional bandits model items as arms and cannot effectively handle an ever-growing item set. Third, widely used Bernoulli-based reward mechanisms return only a binary 0/1 reward, ignoring rich implicit feedback such as clicks and add-to-cart actions. To address these problems, we propose an algorithm named Dynamic Clustering based Contextual Combinatorial Multi-Armed Bandits (DC³MAB), which consists of three configurable key components. Specifically, a dynamic user clustering strategy enables different users in the same cluster to cooperate in estimating the expected rewards of arms. A dynamic item partitioning approach based on collaborative filtering significantly reduces the number of arms and produces a recommendation list instead of a single item, providing diversity. In addition, a multi-class reward mechanism based on fine-grained implicit feedback helps better capture user preferences. Extensive empirical experiments on three real-world datasets demonstrate the superiority of the proposed DC³MAB over state-of-the-art bandits (on average, +75.8% in F1 and +54.3% in cumulative reward). The source code is available at https://github.com/HaixHan/DC3MAB. © 2022 The Author(s). Published by Elsevier B.V.
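The three components named in the abstract (cluster-level cooperation in reward estimation, item partitions as arms, and graded rewards from implicit feedback) can be pictured with a short sketch. The Python below is only an illustration of the general idea, not the authors' DC³MAB implementation (see the linked repository for that); the class name ClusteredLinUCB, the feedback labels, and the reward values in REWARDS are assumptions made for this sketch.

import numpy as np

# Hypothetical multi-class reward mapping for implicit feedback;
# the actual grades used in the paper may differ.
REWARDS = {"skip": 0.0, "click": 0.3, "add_to_cart": 0.6, "purchase": 1.0}

class ClusteredLinUCB:
    """LinUCB-style bandit that shares ridge-regression statistics
    within a user cluster, so users in the same cluster cooperate
    in estimating the expected rewards of arms."""

    def __init__(self, n_clusters, dim, alpha=1.0):
        self.alpha = alpha                                   # exploration strength
        self.A = [np.eye(dim) for _ in range(n_clusters)]    # per-cluster Gram matrix
        self.b = [np.zeros(dim) for _ in range(n_clusters)]  # per-cluster reward vector

    def recommend(self, cluster, item_features, k=3):
        """Score each arm (here: an item partition) by its upper
        confidence bound and return the indices of the top k."""
        A_inv = np.linalg.inv(self.A[cluster])
        theta = A_inv @ self.b[cluster]                      # cluster preference estimate
        scores = [x @ theta + self.alpha * np.sqrt(x @ A_inv @ x)
                  for x in item_features]
        return np.argsort(scores)[::-1][:k]                  # a top-k list, not one item

    def update(self, cluster, x, feedback):
        """Update the cluster's model with a graded (multi-class)
        reward instead of a Bernoulli 0/1 signal."""
        r = REWARDS[feedback]
        self.A[cluster] += np.outer(x, x)
        self.b[cluster] += r * x

A caller would periodically re-cluster users (the dynamic part), map the current user to a cluster, request a list of arms, and feed back the strongest observed signal, e.g.:

model = ClusteredLinUCB(n_clusters=5, dim=8)
items = np.random.rand(20, 8)   # feature vectors of 20 item partitions
top = model.recommend(cluster=2, item_features=items, k=3)
model.update(cluster=2, x=items[top[0]], feedback="add_to_cart")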
Pages: 13
Related papers
50 records in total
  • [1] Contextual Multi-Armed Bandit for Email Layout Recommendation
    Chen, Yan
    Vankov, Emilian
    Baltrunas, Linas
    Donovan, Preston
    Mehta, Akash
    Schroeder, Benjamin
    Herman, Matthew
    [J]. PROCEEDINGS OF THE 17TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2023, 2023, : 400 - 402
  • [2] A combinatorial multi-armed bandit approach to correlation clustering
    Gullo, F.
    Mandaglio, D.
    Tagarelli, A.
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2023, 37 (04) : 1630 - 1691
  • [3] Contextual Combinatorial Volatile Multi-armed Bandit with Adaptive Discretization
    Nika, Andi
    Elahi, Sepehr
    Tekin, Cem
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1486 - 1495
  • [4] Two-Phase Multi-armed Bandit for Online Recommendation
    Yan, Cairong
    Han, Haixia
    Wang, Zijian
    Zhang, Yanting
    [J]. 2021 IEEE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2021,
  • [5] CAREForMe: Contextual Multi-Armed Bandit Recommendation Framework for Mental Health
    Yu, Sheng
    Nourzad, Narjes
    Semple, Randye J.
    Zhao, Yixue
    Zhou, Emily
    Krishnamachari, Bhaskar
    [J]. PROCEEDINGS OF THE 2024 IEEE/ACM 11TH INTERNATIONAL CONFERENCE ON MOBILE SOFTWARE ENGINEERING AND SYSTEMS, MOBILESOFT 2024, 2024, : 92 - 94
  • [6] Dynamic Consensus Community Detection and Combinatorial Multi-Armed Bandit
    Mandaglio, Domenico
    Tagarelli, Andrea
    [J]. PROCEEDINGS OF THE 2019 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2019), 2019, : 184 - 187
  • [7] Multi-armed bandit problem with online clustering as side information
    Dzhoha, Andrii
    Rozora, Iryna
    [J]. JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2023, 427
  • [8] Variational inference for the multi-armed contextual bandit
    Urteaga, Inigo
    Wiggins, Chris H.
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [9] Dynamic Multi-Armed Bandit with Covariates
    Pavlidis, Nicos G.
    Tasoulis, Dimitris K.
    Adams, Niall M.
    Hand, David J.
    [J]. ECAI 2008, PROCEEDINGS, 2008, 178 : 777+