Neuromodulatory adaptive combination of correlation-based learning in cerebellum and reward-based learning in basal ganglia for goal-directed behavior control

Cited by: 20
Authors
Dasgupta, Sakyasingha [1 ,2 ]
Woergoetter, Florentin [1 ,2 ]
Manoonpong, Poramate [2 ,3 ]
Affiliations
[1] Univ Gottingen, Inst Phys Biophys, D-37077 Gottingen, Germany
[2] Univ Gottingen, Bernstein Ctr Computat Neurosci, D-37077 Gottingen, Germany
[3] Univ Southern Denmark, Maersk Mc Kinney Moller Inst, Ctr Biorobot, Odense, Denmark
Keywords
decision making; recurrent neural networks; basal ganglia; cerebellum; operant conditioning; classical conditioning; neuromodulation; correlation learning; SUPPLEMENTARY MOTOR AREA; ACTION SELECTION; HETEROSYNAPTIC MODULATION; MODEL; TIME; MECHANISMS; PLASTICITY; OPERANT; CORTEX; MEMORY;
DOI
10.3389/fncir.2014.00126
Chinese Library Classification
Q189 [Neuroscience];
Subject classification code
071006 ;
Abstract
Goal-directed decision making in biological systems is broadly based on associations between conditional and unconditional stimuli. This can be further classified as classical conditioning (correlation-based learning) and operant conditioning (reward-based learning). A number of computational and experimental studies have well established the role of the basal ganglia in reward-based learning, whereas the cerebellum plays an important role in developing specific conditioned responses. Although viewed as distinct learning systems, recent animal experiments point toward their complementary role in behavioral learning, and also show the existence of substantial two-way communication between these two brain structures. Based on this notion of cooperative learning, in this paper we hypothesize that the basal ganglia and cerebellar learning systems work in parallel and interact with each other. We envision that such an interaction is influenced by a reward-modulated heterosynaptic plasticity (RMHP) rule at the thalamus, guiding the overall goal-directed behavior. Using a recurrent neural network actor-critic model of the basal ganglia and a feed-forward correlation-based learning model of the cerebellum, we demonstrate that the RMHP rule can effectively balance the outcomes of the two learning systems. This is tested using simulated environments of increasing complexity, with a four-wheeled robot performing a foraging task in both static and dynamic configurations. Although modeled at a simplified level of biological abstraction, we clearly demonstrate that such an RMHP-induced combinatorial learning mechanism leads to more stable and faster learning of goal-directed behaviors, in comparison to the individual systems. Thus, in this paper we provide a computational model for the adaptive combination of the basal ganglia and cerebellum learning systems, by way of neuromodulated plasticity, for goal-directed decision making in biological and bio-mimetic organisms.
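The core idea of the abstract — two parallel controllers (a reward-based basal-ganglia model and a correlation-based cerebellar model) whose outputs are mixed at the thalamus with weights adapted by a reward-modulated heterosynaptic plasticity rule — can be illustrated with a loose sketch. This is a hypothetical simplification, not the paper's actual equations: the function name `rmhp_update`, the dummy commands `u_bg` and `u_cb`, the learning rate, and the normalization step are all illustrative assumptions.

```python
import numpy as np

def rmhp_update(w, inputs, reward, lr=0.01):
    """Illustrative reward-modulated plasticity step (not the paper's exact rule).

    Each combination weight is nudged by the product of the reward signal,
    its own pathway's activity, and the combined (postsynaptic) output, so
    the change at one synapse depends on activity arriving via the other
    pathway as well. Weights are renormalized to keep a convex mixture.
    """
    u = w @ inputs                      # combined thalamic output
    w = w + lr * reward * inputs * u    # reward-gated weight change
    return w / w.sum()                  # normalize: weights stay a mixture

# Hypothetical single control step combining the two learning systems.
w = np.array([0.5, 0.5])       # [basal-ganglia weight, cerebellum weight]
u_bg, u_cb = 0.8, 0.2          # dummy steering commands from each controller
inputs = np.array([u_bg, u_cb])
reward = 1.0                   # dummy reward from the foraging task

u = w @ inputs                 # motor command sent to the robot
w = rmhp_update(w, inputs, reward)
```

With a positive reward and a stronger basal-ganglia command, the update shifts the mixture toward the basal-ganglia pathway; with zero reward, the weights are unchanged. The actual RMHP rule in the paper operates on the model's specific pre- and postsynaptic signals at the thalamus.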
Pages: 21