Hybrid Reward Architecture for Reinforcement Learning

Authors
van Seijen, Harm [1]
Fatemi, Mehdi [1]
Romoff, Joshua [1, 2]
Laroche, Romain [1]
Barnes, Tavian [1]
Tsang, Jeffrey [1]
Affiliations
[1] Microsoft Maluuba, Montreal, PQ, Canada
[2] McGill Univ, Montreal, PQ, Canada
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One of the main challenges in reinforcement learning (RL) is generalisation. In typical deep RL methods this is achieved by approximating the optimal value function with a low-dimensional representation using a deep network. While this approach works well in many domains, learning can be very slow and unstable in domains where the optimal value function cannot easily be reduced to a low-dimensional representation. This paper contributes towards tackling such challenging domains by proposing a new method called Hybrid Reward Architecture (HRA). HRA takes as input a decomposed reward function and learns a separate value function for each component reward function. Because each component typically depends on only a subset of all features, the corresponding value function can be approximated more easily by a low-dimensional representation, enabling more effective learning. We demonstrate HRA on a toy problem and the Atari game Ms. Pac-Man, where HRA achieves above-human performance.
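The idea of learning one value function per reward component can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the abstract does not say how the component value functions are combined into a policy, so summing them and acting greedily on the sum is an assumption here, as are the tabular representation, the per-head Q-learning update, and every name used (TabularHRA, aggregate, act, update).

```python
import random


class TabularHRA:
    """Hypothetical sketch: one tabular Q-function ("head") per reward component."""

    def __init__(self, n_states, n_actions, n_components, alpha=0.5, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # q[k][s][a] is the value estimate of head k for state s and action a.
        self.q = [
            [[0.0] * n_actions for _ in range(n_states)]
            for _ in range(n_components)
        ]

    def aggregate(self, state):
        # Assumption: heads are combined by an unweighted sum per action.
        return [
            sum(head[state][a] for head in self.q)
            for a in range(self.n_actions)
        ]

    def act(self, state, epsilon=0.0, rng=random):
        # Epsilon-greedy on the aggregated values; ties go to the lowest action index.
        if rng.random() < epsilon:
            return rng.randrange(self.n_actions)
        totals = self.aggregate(state)
        return max(range(self.n_actions), key=totals.__getitem__)

    def update(self, state, action, rewards, next_state, done):
        # rewards holds one scalar per component; each head sees only its own.
        for head, r in zip(self.q, rewards):
            bootstrap = 0.0 if done else self.gamma * max(head[next_state])
            td_error = r + bootstrap - head[state][action]
            head[state][action] += self.alpha * td_error


# Usage: two reward components; only the first one pays out on this transition.
agent = TabularHRA(n_states=3, n_actions=2, n_components=2)
agent.update(state=0, action=1, rewards=[1.0, 0.0], next_state=1, done=True)
best_action = agent.act(0)
```

In this sketch each head is updated only from its own component reward, so a head whose reward depends on a few features could in principle be replaced by a small function approximator over just those features.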
Pages: 11