Fast Learning in an Actor-Critic Architecture with Reward and Punishment

Cited by: 0
Authors
Balkenius, Christian [1]
Winberg, Stefan [1]
Affiliations
[1] Lund Univ Cognit Sci, SE-22222 Lund, Sweden
Keywords
Reinforcement learning; reward; punishment; generalization
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A reinforcement architecture is introduced that consists of three complementary learning systems with different generalization abilities. The ACTOR learns state-action associations, the CRITIC learns a goal-gradient, and the PUNISH system learns what actions to avoid. The architecture is compared to the standard actor-critic and Q-learning models on a number of maze learning tasks. The novel architecture is shown to be superior on all the test mazes. Moreover, it shows how it is possible to combine several learning systems with different properties in a coherent reinforcement learning framework.
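The division of labor described in the abstract can be illustrated with a minimal tabular sketch. This record does not give the paper's update rules, so everything below is an assumption: the class name `ThreeSystemAgent`, the learning rates, the standard temporal-difference updates for the critic and actor, and the simple running-average punishment table are all hypothetical and are not the authors' method. The sketch also does not model the different generalization abilities the abstract mentions.

```python
import random

N_ACTIONS = 4  # hypothetical: four moves in a grid maze


class ThreeSystemAgent:
    """Illustrative only: three lookup tables standing in for ACTOR, CRITIC and PUNISH."""

    def __init__(self, alpha=0.1, beta=0.1, gamma=0.9, eta=0.5, epsilon=0.1, seed=0):
        self.actor = {}    # (state, action) -> action preference
        self.critic = {}   # state -> value estimate (a goal gradient)
        self.punish = {}   # (state, action) -> learned punishment strength
        self.alpha, self.beta = alpha, beta      # critic / actor learning rates (assumed)
        self.gamma, self.eta = gamma, eta        # discount / punishment learning rate (assumed)
        self.epsilon = epsilon                   # exploration rate (assumed)
        self.rng = random.Random(seed)

    def score(self, state, action):
        # Assumed combination rule: preference minus learned punishment.
        return self.actor.get((state, action), 0.0) - self.punish.get((state, action), 0.0)

    def act(self, state):
        # Epsilon-greedy choice over the combined scores, ties broken at random.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(N_ACTIONS)
        scores = [self.score(state, a) for a in range(N_ACTIONS)]
        best = max(scores)
        return self.rng.choice([a for a, v in enumerate(scores) if v == best])

    def learn(self, state, action, reward, punishment, next_state, done):
        # CRITIC: standard TD(0) update of the state value.
        v = self.critic.get(state, 0.0)
        v_next = 0.0 if done else self.critic.get(next_state, 0.0)
        delta = reward + self.gamma * v_next - v
        self.critic[state] = v + self.alpha * delta
        # ACTOR: move the preference for the taken action along the TD error.
        key = (state, action)
        self.actor[key] = self.actor.get(key, 0.0) + self.beta * delta
        # PUNISH: running average of the punishment observed for this state-action pair.
        p = self.punish.get(key, 0.0)
        self.punish[key] = p + self.eta * (punishment - p)
```

In this sketch a single punished step (for example, bumping into a wall) immediately lowers the score of that action in that state, while reward still has to propagate backward through the critic over many steps. Whether the paper combines the three systems this way is not stated in this record.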
Pages: 20-27
Page count: 8