Automatic Music Playlist Generation via Simulation-based Reinforcement Learning

Cited by: 0
Authors
Tomasi, Federico [1 ]
Cauteruccio, Joseph [2 ]
Kanoria, Surya [3 ]
Ciosek, Kamil [1 ]
Rinaldi, Matteo [4 ]
Dai, Zhenwen [1 ]
Affiliations
[1] Spotify, London, England
[2] Spotify, Boston, MA USA
[3] Spotify, San Francisco, CA USA
[4] Spotify, New York, NY USA
Keywords
music playlist generation; reinforcement learning; recommender systems; simulation;
DOI
10.1145/3580305.3599777
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Personalization of playlists is a common feature in music streaming services, but conventional techniques, such as collaborative filtering, rely on explicit assumptions regarding content quality to learn how to make recommendations. Such assumptions often result in misalignment between offline model objectives and online user satisfaction metrics. In this paper, we present a reinforcement learning framework that solves for such limitations by directly optimizing for user satisfaction metrics via the use of a simulated playlist-generation environment. Using this simulator we develop and train a modified Deep Q-Network, the action head DQN (AH-DQN), in a manner that addresses the challenges imposed by the large state and action space of our RL formulation. The resulting policy is capable of making recommendations from large and dynamic sets of candidate items with the expectation of maximizing consumption metrics. We analyze and evaluate agents offline via simulations that use environment models trained on both public and proprietary streaming datasets. We show how these agents lead to better user-satisfaction metrics compared to baseline methods during online A/B tests. Finally, we demonstrate that performance assessments produced from our simulator are strongly correlated with observed online metric results.
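The abstract's central architectural point is that the action head DQN (AH-DQN) handles a large, dynamic candidate set: rather than emitting one output per action, the network takes a (state, candidate-item) pair as input and returns a scalar Q-value, so any number of candidates can be scored and the best one chosen. The following is a minimal NumPy sketch of that scoring pattern only, not the paper's implementation; all dimensions, weights, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: user/playlist state, track embedding, hidden layer.
STATE_DIM, ACTION_DIM, HIDDEN = 8, 4, 16

# Hypothetical (untrained) weights for a tiny Q-network. The action's
# embedding is an *input* rather than indexing an output unit, which is
# what lets the candidate pool be large and change between steps.
W1 = rng.normal(scale=0.1, size=(STATE_DIM + ACTION_DIM, HIDDEN))
w2 = rng.normal(scale=0.1, size=HIDDEN)

def q_value(state, action_emb):
    """Scalar Q(s, a) for a single candidate item."""
    x = np.concatenate([state, action_emb])   # joint state-action input
    h = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
    return float(h @ w2)

def act(state, candidates):
    """Greedy policy: score every candidate, return the argmax index."""
    scores = [q_value(state, c) for c in candidates]
    return int(np.argmax(scores))

state = rng.normal(size=STATE_DIM)
candidates = rng.normal(size=(100, ACTION_DIM))  # dynamic candidate pool
best = act(state, candidates)                    # index of chosen track
```

In the paper's setting the weights would be trained by Q-learning against the simulated playlist-generation environment; the sketch only illustrates why a state-action-input head scales to recommendation-sized action spaces.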
Pages: 4948-4957
Page count: 10
Related Papers
50 records total
  • [41] Automatic simulation model generation for simulation-based, real-time shop floor control
    Son, YJ
    Wysk, RA
    [J]. COMPUTERS IN INDUSTRY, 2001, 45 (03) : 291 - 308
  • [42] DEEP REINFORCEMENT LEARNING-BASED AUTOMATIC TEST PATTERN GENERATION
    Li, Wenxing
    Lyu, Hongqin
    Liang, Shengwen
    Liu, Zizhen
    Lin, Ning
    Wang, Zhongrui
    Tian, Pengyu
    Wang, Tiancheng
    Li, Huawei
    [J]. CONFERENCE OF SCIENCE & TECHNOLOGY FOR INTEGRATED CIRCUITS, 2024 CSTIC, 2024,
  • [43] Automatic Goal Generation for Reinforcement Learning Agents
    Florensa, Carlos
    Held, David
    Geng, Xinyang
    Abbeel, Pieter
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [44] A reinforcement learning approach to automatic generation control
    Ahamed, TPI
    Rao, PSN
    Sastry, PS
    [J]. ELECTRIC POWER SYSTEMS RESEARCH, 2002, 63 (01) : 9 - 26
  • [45] Automatic Poetry Generation with Mutual Reinforcement Learning
    Yi, Xiaoyuan
    Sun, Maosong
    Li, Ruoyu
    Li, Wenhao
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3143 - 3153
  • [46] Deep Reinforcement Learning for Automatic Thumbnail Generation
    Li, Zhuopeng
    Zhang, Xiaoyan
    [J]. MULTIMEDIA MODELING, MMM 2019, PT II, 2019, 11296 : 41 - 53
  • [47] Non-steady-state Control under Disturbances: Navigating Plant Operation via Simulation-Based Reinforcement Learning
    Kubosawa, Shumpei
    Onishi, Takashi
    Tsuruoka, Yoshimasa
    [J]. 2021 60TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2021, : 799 - 806
  • [48] Interval-based melody generation via reinforcement learning
    Liu, Mingzhi
    Li, Jinlong
    Wang, Yufei
    Zhang, Xu
    Sun, Boyi
    [J]. 2022 8TH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS, BIGCOM, 2022, : 181 - 189
  • [49] AUTHORING FOR SIMULATION-BASED LEARNING
    HENSGENS, J
    VANROSMALEN, P
    VANDERBAAREN, J
    [J]. INSTRUCTIONAL SCIENCE, 1995, 23 (04) : 269 - 296
  • [50] Workshop simulation-based learning
    Impelluso, Thomas
    Bober, Marcie J.
    [J]. 36TH ANNUAL FRONTIERS IN EDUCATION, CONFERENCE PROGRAM, VOLS 1-4: BORDERS: INTERNATIONAL, SOCIAL AND CULTURAL, 2006, : 248 - +