A Spiking Neural Network Model of an Actor-Critic Learning Agent

Cited by: 64
Authors
Potjans, Wiebke [1]
Morrison, Abigail [1]
Diesmann, Markus [1,2]
Affiliations
[1] RIKEN, Brain Sci Inst, Computat Neurosci Grp, Wako, Saitama 3510198, Japan
[2] Univ Freiburg, Bernstein Ctr Computat Neurosci, D-79104 Freiburg, Germany
Keywords
UNCERTAIN ENVIRONMENTS; DEPENDENT PLASTICITY; SYNAPTIC PLASTICITY; BASAL GANGLIA; REINFORCEMENT; DOPAMINE; TIME; PREDICTION; POTENTIATION; PROPAGATION;
DOI
10.1162/neco.2008.08-07-593
CLC (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The ability to adapt behavior to maximize reward as a result of interactions with the environment is crucial for the survival of any higher organism. In the framework of reinforcement learning, temporal-difference learning algorithms provide an effective strategy for such goal-directed adaptation, but it is unclear to what extent these algorithms are compatible with neural computation. In this article, we present a spiking neural network model that implements actor-critic temporal-difference learning by combining local plasticity rules with a global reward signal. The network is capable of solving a nontrivial gridworld task with sparse rewards. We derive a quantitative mapping of plasticity parameters and synaptic weights to the corresponding variables in the standard algorithmic formulation and demonstrate that the network learns with a similar speed to its discrete time counterpart and attains the same equilibrium performance.
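The discrete-time algorithm that the spiking network is mapped onto can be illustrated with a minimal tabular sketch: a critic learns state values, an actor learns action preferences, and both are driven by the same global TD-error signal. This is not the paper's spiking implementation; the gridworld size, learning rates, discount factor, and reward placement below are arbitrary choices for the example.

```python
import numpy as np

def run_actor_critic(n=5, episodes=300, alpha=0.1, beta=0.5, gamma=0.9, seed=0):
    """Tabular actor-critic TD learning on an n x n gridworld with a single
    rewarded goal state in the far corner (a sparse-reward task)."""
    rng = np.random.default_rng(seed)
    n_states = n * n
    goal = n_states - 1                          # bottom-right corner
    V = np.zeros(n_states)                       # critic: state-value estimates
    H = np.zeros((n_states, 4))                  # actor: action preferences
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
    lengths = []                                 # steps per episode (learning curve)
    for _ in range(episodes):
        s, steps = 0, 0                          # start in the top-left corner
        while s != goal and steps < 500:
            # softmax policy over the actor's action preferences
            p = np.exp(H[s] - H[s].max())
            p /= p.sum()
            a = rng.choice(4, p=p)
            # deterministic move, clipped at the grid walls
            r, c = divmod(s, n)
            r2 = min(max(r + moves[a][0], 0), n - 1)
            c2 = min(max(c + moves[a][1], 0), n - 1)
            s2 = r2 * n + c2
            reward = 1.0 if s2 == goal else 0.0
            # TD error: the single global reinforcement signal
            # shared by the critic and the actor
            delta = reward + gamma * V[s2] * (s2 != goal) - V[s]
            V[s] += alpha * delta                # critic update
            H[s, a] += beta * delta              # actor update
            s, steps = s2, steps + 1
        lengths.append(steps)
    return lengths

lengths = run_actor_critic()
print("first episode:", lengths[0], "steps; last episode:", lengths[-1], "steps")
```

In the paper's spiking formulation, the role of `V` and `H` is played by synaptic weights updated by local plasticity rules, with `delta` conveyed as a global neuromodulatory reward signal; the hyperparameters here have no claimed correspondence to the paper's plasticity parameters.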
Pages: 301 - 339 (39 pages)