Fast Learning in an Actor-Critic Architecture with Reward and Punishment

Cited by: 0

Authors:

Balkenius, Christian [1]

Winberg, Stefan [1]

Affiliations:

[1] Lund Univ Cognit Sci, SE-22222 Lund, Sweden

Source:

TENTH SCANDINAVIAN CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2008, Vol. 173

Keywords:

Reinforcement learning; reward; punishment; generalization

DOI:

Not available

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A reinforcement learning architecture is introduced that consists of three complementary learning systems with different generalization abilities. The ACTOR learns state-action associations, the CRITIC learns a goal-gradient, and the PUNISH system learns which actions to avoid. The architecture is compared to the standard actor-critic and Q-learning models on a number of maze learning tasks. The novel architecture is shown to be superior on all the test mazes. Moreover, it shows how several learning systems with different properties can be combined in a coherent reinforcement learning framework.
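To make the division of labour among the three systems concrete, the following is a minimal tabular sketch in Python. It is not the authors' implementation: the abstract does not give the update rules, so the temporal-difference update, the parameters `alpha`, `gamma`, and `epsilon`, the class name `ThreeSystemAgent`, and the way punishment is subtracted from the actor's preference are all assumptions made for illustration.

```python
import random


class ThreeSystemAgent:
    """Hypothetical tabular agent with ACTOR, CRITIC, and PUNISH tables.

    All update rules are illustrative assumptions, not taken from the paper.
    """

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha    # learning rate (assumed)
        self.gamma = gamma    # discount factor (assumed)
        self.rng = random.Random(seed)
        # ACTOR: state-action associations.
        self.actor = [[0.0] * n_actions for _ in range(n_states)]
        # CRITIC: one value per state, forming a gradient toward the goal.
        self.critic = [0.0] * n_states
        # PUNISH: learned estimate of how bad each action is in each state.
        self.punish = [[0.0] * n_actions for _ in range(n_states)]

    def preference(self, state, action):
        # Assumed combination rule: punished actions are inhibited.
        return self.actor[state][action] - self.punish[state][action]

    def act(self, state, epsilon=0.1):
        # Epsilon-greedy choice over the combined preferences.
        if self.rng.random() < epsilon:
            return self.rng.randrange(self.n_actions)
        prefs = [self.preference(state, a) for a in range(self.n_actions)]
        best = max(prefs)
        return self.rng.choice([a for a, p in enumerate(prefs) if p == best])

    def learn(self, state, action, reward, punishment, next_state, done):
        # Standard TD(0) error computed from the CRITIC's state values.
        bootstrap = 0.0 if done else self.gamma * self.critic[next_state]
        td_error = reward + bootstrap - self.critic[state]
        self.critic[state] += self.alpha * td_error
        self.actor[state][action] += self.alpha * td_error
        # PUNISH is updated only when a punishment signal is received.
        if punishment > 0.0:
            old = self.punish[state][action]
            self.punish[state][action] = old + self.alpha * (punishment - old)
```

In a maze task, `punishment` might be set to a positive value when the agent bumps into a wall, so that the PUNISH table suppresses that action immediately while the CRITIC's values propagate back from the goal more slowly. This interpretation is likewise an assumption about how such a system could be used.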


Pages: 20-27

Page count: 8
