SMONAC: Supervised Multiobjective Negative Actor-Critic for Sequential Recommendation

Cited by: 1
Authors
Zhou, Fei [1]
Luo, Biao [1]
Wu, Zhengke [1]
Huang, Tingwen [2]
Affiliations
[1] Cent South Univ, Sch Automat, Changsha 410000, Peoples R China
[2] Texas A&M Univ Qatar, Dept Sci, Doha, Qatar
Funding
National Natural Science Foundation of China
Keywords
Actor-critic; reinforcement learning (RL); sequential recommendation system (SRS); supervised learning;
DOI
10.1109/TNNLS.2023.3317353
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent research shows that optimizing for accuracy alone can lead to homogeneous, repetitive recommendations and harm long-term user engagement. Multiobjective reinforcement learning (RL) is a promising way to balance multiple objectives, including accuracy, diversity, and novelty. However, existing multiobjective RL approaches have two deficiencies: they neglect to update the Q values of negative actions, and the RL Q-networks provide only limited regulation of the (self-)supervised learning recommendation network. To address these deficiencies, we develop the supervised multiobjective negative actor-critic (SMONAC) algorithm, which comprises a negative action update mechanism and a multiobjective actor-critic mechanism. In the negative action update mechanism, several negative actions are randomly sampled at each update step, and an offline RL approach is then used to learn their Q values. In the multiobjective actor-critic mechanism, the accuracy, diversity, and novelty Q values are integrated into a scalarized Q value, which is used to critique the supervised learning recommendation network. Comparative experiments on two real-world datasets demonstrate that SMONAC achieves substantial performance improvements, especially on the diversity and novelty metrics.
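The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the three Q values are combined or how the scalarized Q value enters the supervised loss, so the weighted-sum scalarization, the Q-weighted log-likelihood loss, the uniform sampling scheme, and every function name and weight below are assumptions made only for illustration.

```python
import random


def sample_negative_actions(num_items, positive_item, k, rng):
    """Draw k distinct item ids uniformly at random, excluding the
    observed (positive) item. Uniform sampling is an assumption."""
    candidates = [i for i in range(num_items) if i != positive_item]
    return rng.sample(candidates, k)


def scalarize_q(q_accuracy, q_diversity, q_novelty, weights):
    """Combine per-objective Q values into one scalar.
    A weighted sum is assumed here; the abstract only says the values
    are 'integrated into the scalarized Q value'."""
    w_acc, w_div, w_nov = weights
    return w_acc * q_accuracy + w_div * q_diversity + w_nov * q_novelty


def critic_weighted_loss(log_prob_positive, scalarized_q):
    """Hypothetical actor loss: the supervised negative log-likelihood of
    the observed item, scaled by the scalarized Q value (treated as a
    constant critic signal)."""
    return -scalarized_q * log_prob_positive


rng = random.Random(0)
negatives = sample_negative_actions(num_items=10, positive_item=3, k=4, rng=rng)
q = scalarize_q(0.5, 0.25, 0.25, weights=(0.5, 0.25, 0.25))
loss = critic_weighted_loss(log_prob_positive=-2.0, scalarized_q=q)
```

In this sketch a larger scalarized Q value increases the weight of the supervised signal for the observed item, which is one plausible reading of how a critic could regulate the recommendation network.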
Pages: 1-13
Page count: 13