Intrinsically motivated reinforcement learning based recommendation with counterfactual data augmentation

Cited by: 2
Authors
Chen, Xiaocong [1 ]
Wang, Siyu [1 ]
Qi, Lianyong [2 ]
Li, Yong [3 ]
Yao, Lina [1 ,4 ]
Affiliations
[1] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
[2] China Univ Petr East China, Coll Comp Sci & Technol, Dong Ying Shi, Peoples R China
[3] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[4] CSIRO, Data 61, Eveleigh, NSW 2015, Australia
Keywords
Recommender systems; Deep reinforcement learning; Counterfactual reasoning; Capacity
DOI
10.1007/s11280-023-01187-7
CLC number
TP [Automation & computer technology]
Discipline code
0812
Abstract
Deep reinforcement learning (DRL) has shown promising results in modeling dynamic user preferences in recommender systems (RS) in recent literature. However, training a DRL agent in the sparse RS environment poses a significant challenge: the agent must balance exploring informative user-item interaction trajectories against exploiting existing trajectories for policy learning, the well-known exploration-exploitation trade-off. This trade-off strongly affects recommendation performance when the environment is sparse. In DRL-based RS, balancing exploration and exploitation is even more challenging, as the agent needs to explore informative trajectories deeply and exploit them efficiently in the RS context. To address this issue, we propose a novel intrinsically motivated reinforcement learning (IMRL) method that enhances the agent's capability to explore informative interaction trajectories in the sparse environment. We further enrich these trajectories via an adaptive counterfactual augmentation strategy with a customised threshold to improve their efficiency in exploitation. Our approach is evaluated on six offline datasets and three online simulation platforms, and extensive experiments show that IMRL outperforms existing state-of-the-art methods in recommendation performance in the sparse RS environment.
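The abstract describes augmenting a sparse extrinsic recommendation signal with an intrinsic exploration bonus. As a generic illustration only (not the paper's actual IMRL implementation), a common curiosity-style scheme adds a dynamics-model prediction error to the environment reward; the function names and the `beta` weight below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def intrinsic_reward(pred_next_state, true_next_state):
    # Curiosity-style bonus: squared prediction error of a (learned) dynamics
    # model. Poorly predicted, i.e. novel, transitions yield a larger bonus.
    return float(np.sum((pred_next_state - true_next_state) ** 2))

def shaped_reward(extrinsic, pred_next_state, true_next_state, beta=0.1):
    # Total reward = extrinsic signal (e.g. click/purchase) + scaled bonus.
    return extrinsic + beta * intrinsic_reward(pred_next_state, true_next_state)

# Toy example: the extrinsic reward is 0 (sparse feedback), but the
# intrinsic bonus still gives the agent a nonzero learning signal.
s_pred = rng.normal(size=4)   # dynamics model's predicted next state
s_true = rng.normal(size=4)   # observed next state
r = shaped_reward(extrinsic=0.0, pred_next_state=s_pred, true_next_state=s_true)
assert r > 0.0  # a novel transition is rewarded even without user feedback
```

The bonus vanishes as the dynamics model improves on familiar transitions, so exploration is concentrated on under-visited regions of the user-item interaction space.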
Pages: 3253-3274
Page count: 22
Related papers
50 records
  • [1] Intrinsically motivated reinforcement learning based recommendation with counterfactual data augmentation
    Xiaocong Chen
    Siyu Wang
    Lianyong Qi
    Yong Li
    Lina Yao
    World Wide Web, 2023, 26 : 3253 - 3274
  • [2] Intrinsically Motivated NeuroEvolution for Vision-Based Reinforcement Learning
    Cuccu, Giuseppe
    Luciw, Matthew
    Schmidhuber, Juergen
    Gomez, Faustino
    2011 IEEE INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING (ICDL), 2011,
  • [3] Skill-based curiosity for intrinsically motivated reinforcement learning
    Nicolas Bougie
    Ryutaro Ichise
    Machine Learning, 2020, 109 : 493 - 512
  • [4] Skill-based curiosity for intrinsically motivated reinforcement learning
    Bougie, Nicolas
    Ichise, Ryutaro
    MACHINE LEARNING, 2020, 109 (03) : 493 - 512
  • [5] Intrinsically Motivated Lifelong Exploration in Reinforcement Learning
    Bougie, Nicolas
    Ichise, Ryutaro
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 1357 : 109 - 120
  • [6] Evolution and learning in an intrinsically motivated reinforcement learning robot
    Schembri, Massimiliano
    Mirolli, Marco
    Baldassarre, Gianluca
    ADVANCES IN ARTIFICIAL LIFE, PROCEEDINGS, 2007, 4648 : 294 - +
  • [7] Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective
    Singh, Satinder
    Lewis, Richard L.
    Barto, Andrew G.
    Sorg, Jonathan
    IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, 2010, 2 (02) : 70 - 82
  • [8] User Feedback-Based Counterfactual Data Augmentation for Sequential Recommendation
    Wang, Haiyang
    Chu, Yan
    Ning, Hui
    Wang, Zhengkui
    Shan, Wen
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT III, KSEM 2023, 2023, 14119 : 370 - 382
  • [9] Intrinsically Motivated Self-supervised Learning in Reinforcement Learning
    Zhao, Yue
    Du, Chenzhuang
    Zhao, Hang
    Li, Tiejun
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2022), 2022, : 3605 - 3615
  • [10] Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning
    Mohamed, Shakir
    Rezende, Danilo J.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28