Fast Learning in an Actor-Critic Architecture with Reward and Punishment

Cited by: 0
Authors
Balkenius, Christian [1 ]
Winberg, Stefan [1 ]
Affiliations
[1] Lund Univ Cognit Sci, SE-22222 Lund, Sweden
Keywords
Reinforcement learning; reward; punishment; generalization
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A reinforcement architecture is introduced that consists of three complementary learning systems with different generalization abilities. The ACTOR learns state-action associations, the CRITIC learns a goal-gradient, and the PUNISH system learns what actions to avoid. The architecture is compared to the standard actor-critic and Q-learning models on a number of maze learning tasks. The novel architecture is shown to be superior on all the test mazes. Moreover, it shows how it is possible to combine several learning systems with different properties in a coherent reinforcement learning framework.
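The three-way split described in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: the maze layout, the softmax action selection, the learning rates, the one-shot punishment rule, and every name below (`MAZE`, `step`, `train`) are assumptions made for illustration only. The critic is a plain TD(0) value table standing in for the goal-gradient, the actor is a table of action preferences updated by the TD error, and the punish table simply blocks any action that has once led into a wall.

```python
import math
import random

# Hypothetical 4x4 maze (not from the paper): S start, G goal, # wall.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Return (next_state, reward, punished, done) for one move."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    blocked = not (0 <= nr < 4 and 0 <= nc < 4) or MAZE[nr][nc] == "#"
    if blocked:
        return state, 0.0, True, False   # bumped: stay put, punished
    if MAZE[nr][nc] == "G":
        return (nr, nc), 1.0, False, True
    return (nr, nc), 0.0, False, False


def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    cells = [(r, c) for r in range(4) for c in range(4)]
    value = {s: 0.0 for s in cells}           # CRITIC: state values
    pref = {s: [0.0] * 4 for s in cells}      # ACTOR: action preferences
    punish = {s: [False] * 4 for s in cells}  # PUNISH: actions to avoid
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):                  # step limit per episode
            # Punished actions are excluded before the actor chooses.
            allowed = [a for a in range(4) if not punish[state][a]]
            top = max(pref[state][a] for a in allowed)
            weights = [math.exp(pref[state][a] - top) for a in allowed]
            action = rng.choices(allowed, weights=weights)[0]
            nxt, reward, punished, done = step(state, action)
            if punished:
                # Assumed rule: a single punishment is enough to
                # suppress the action; actor and critic are untouched.
                punish[state][action] = True
                continue
            target = reward + (0.0 if done else gamma * value[nxt])
            delta = target - value[state]        # TD error
            value[state] += alpha * delta        # critic update
            pref[state][action] += alpha * delta  # actor update
            state = nxt
            if done:
                break
    return value, pref, punish


if __name__ == "__main__":
    value, pref, punish = train()
    for r in range(4):
        print(" ".join(f"{value[(r, c)]:.2f}" for c in range(4)))
```

Keeping the punish table separate from the actor means a wall collision never has to be unlearned through slow preference updates, which is one plausible reading of how separate systems with different properties could be combined; whether the paper combines them this way is not stated in the abstract.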
Pages: 20-27
Page count: 8