Policy GRU-RL: Simplified Music Playlist Recommendation Using Sequential on Reinforcement Learning Concept

Cited by: 0

Authors:

Chanarong, Chanapa [1]

Maneeroj, Saranya [1]

Affiliations:

[1] Chulalongkorn Univ, Fac Sci, Dept Math & Comp Sci, Bangkok, Thailand

Source:

2024 21ST INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING, JCSSE 2024 | 2024

Keywords:

DOI:

10.1109/JCSSE61278.2024.10613646

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In streaming services, recommendation systems play a crucial role in meeting user preferences by helping users discover music tailored to their tastes. Reinforcement learning (RL) is a popular method for music recommendation. Nevertheless, prior approaches have struggled with over-fitting. After a certain learning period, the agent may predict actions based solely on past interactions, which poses problems for the current user context. To address this limitation, previous methods must be retrained by resetting all parameters in the agent. This study introduces the Policy GRU-RL method, which combines sequential-based learning and reinforcement learning to tackle over-fitting without the need to reset all parameters. The method exploits the features of a recurrent network by implementing an epsilon-greedy policy within the GRU gate. An update gate in the GRU determines whether to choose the random action (the current input of the GRU cell) or the optimal action (information from the preceding GRU cell, containing the actions with maximum rewards). Additionally, it carries ε and the action values through each iteration, and it assesses over-fitting by checking for duplicated predicted actions in specific iterations. When such duplication is found, the ε parameter in the agent is reset. The results demonstrate that the proposed Policy GRU-RL surpasses baseline approaches in terms of accuracy.
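The exploration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' Policy GRU-RL: it contains no GRU and no update gate. It only shows plain tabular epsilon-greedy selection in which epsilon alone is reset when the agent keeps predicting the same action, while the learned action values are kept. The class name, the decay schedule, the window-based duplicate check, and all parameter values are assumptions made for illustration.

```python
import random
from collections import deque


class EpsilonGreedyAgent:
    """Hypothetical tabular agent: epsilon-greedy with an epsilon-only reset.

    Assumption: "over-fitting" is approximated here as choosing the same
    action for `window` consecutive steps. The paper's actual criterion
    and its GRU gating are not reproduced.
    """

    def __init__(self, n_actions, epsilon=1.0, decay=0.9, min_epsilon=0.05,
                 window=5, seed=0):
        self.n_actions = n_actions
        self.initial_epsilon = epsilon
        self.epsilon = epsilon
        self.decay = decay
        self.min_epsilon = min_epsilon
        self.values = [0.0] * n_actions     # running mean reward per action
        self.counts = [0] * n_actions       # number of updates per action
        self.recent = deque(maxlen=window)  # most recently chosen actions
        self.rng = random.Random(seed)

    def select_action(self):
        """Pick a random action with probability epsilon, else the best one."""
        if self.rng.random() < self.epsilon:
            action = self.rng.randrange(self.n_actions)
        else:
            action = max(range(self.n_actions), key=self.values.__getitem__)
        self.recent.append(action)
        return action

    def is_stuck(self):
        """True when the last `window` chosen actions are all identical."""
        return (len(self.recent) == self.recent.maxlen
                and len(set(self.recent)) == 1)

    def update(self, action, reward):
        """Update the action value, decay epsilon, and reset it if stuck."""
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]
        self.epsilon = max(self.min_epsilon, self.epsilon * self.decay)
        if self.is_stuck():
            # Only epsilon is reset; the learned action values are kept.
            self.epsilon = self.initial_epsilon
            self.recent.clear()
```

In a training loop, each step would call `select_action()`, observe a reward from the environment (for example, whether the user played or skipped the recommended track), and pass both to `update()`.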


Pages: 551-557

Page count: 7

