Neural Combinatorial Learning of Goal-directed Behavior with Reservoir Critic and Reward Modulated Hebbian Plasticity

Cited by: 3
Authors
Dasgupta, Sakyasingha [1]
Woergoetter, Florentin [1]
Morimoto, Jun [2]
Manoonpong, Poramate [1]
Affiliations
[1] Univ Gottingen, BCCN, Friedrich Hund Pl 1, D-37077 Gottingen, Germany
[2] ATR Computat Neurosci Lab, Kyoto 6190288, Japan
Keywords
Reinforcement learning; Reservoir networks; Correlation learning; Temporal memory
DOI
10.1109/SMC.2013.174
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Learning of goal-directed behaviors in biological systems is broadly based on associations between conditional and unconditional stimuli. This can be further classified as classical conditioning (correlation-based learning) and operant conditioning (reward-based learning). Although the two are traditionally modeled as separate learning systems in artificial agents, numerous animal experiments point towards their cooperative role in behavioral learning. Based on this concept, the recently introduced framework of neural combinatorial learning combines the two systems, which run in parallel to guide the overall learned behavior. Such combinatorial learning yields a faster and more efficient learner. In this work, we further improve the framework by applying a reservoir computing (RC) network as an adaptive critic unit together with reward-modulated Hebbian plasticity. Using a mobile robot system for goal-directed behavior learning, we demonstrate that the reservoir critic outperforms traditional radial basis function (RBF) critics in terms of stability of convergence and learning time. Furthermore, the temporal memory in the RC allows the system to learn a partially observable Markov decision process (POMDP) scenario, in contrast to a memoryless RBF critic.
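The idea of a reservoir used as an adaptive critic can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an echo-state-style reservoir with fixed random weights and a linear readout trained by a plain TD(0) rule, and it leaves out the reward-modulated Hebbian plasticity and the correlation-based learner entirely. All function names, parameters, and the toy "cue then blank observations" episode are hypothetical choices made for illustration.

```python
import numpy as np


def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Build fixed random input and recurrent weights (assumed ESN-style setup)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w_res = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    w_res *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w_res)))
    return w_in, w_res


def reservoir_states(inputs, w_in, w_res):
    """Drive the reservoir with a (T, n_in) input sequence; return (T, n_res) states."""
    x = np.zeros(w_res.shape[0])
    states = []
    for u in inputs:
        # The recurrent term carries a fading memory of earlier inputs.
        x = np.tanh(w_in @ u + w_res @ x)
        states.append(x)
    return np.array(states)


def train_td_critic(states, rewards, gamma=0.9, lr=0.002, epochs=500):
    """Train a linear readout v_t = w_out . x_t with TD(0) on one episode.

    rewards[t] is the reward received after step t; the value after the
    final step is taken to be zero.
    """
    w_out = np.zeros(states.shape[1])
    n_steps = len(states)
    for _ in range(epochs):
        for t in range(n_steps):
            v_t = w_out @ states[t]
            v_next = w_out @ states[t + 1] if t + 1 < n_steps else 0.0
            td_error = rewards[t] + gamma * v_next - v_t
            w_out += lr * td_error * states[t]
    return w_out


# Toy episode: a cue is observed only at the first step, the remaining
# observations are identical (zero), and reward arrives at the last step.
inputs = np.zeros((5, 1))
inputs[0, 0] = 1.0
rewards = np.array([0.0, 0.0, 0.0, 0.0, 1.0])

w_in, w_res = make_reservoir(n_in=1, n_res=30)
states = reservoir_states(inputs, w_in, w_res)
w_out = train_td_critic(states, rewards)
values = states @ w_out  # one value estimate per time step
```

In this toy episode the observations at steps 1 to 4 are identical, so a critic that sees only the current observation must assign them the same value. The reservoir state differs at each step because it still reflects the earlier cue, which is the kind of temporal memory the abstract credits for handling partially observable scenarios.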
Pages: 993-1000
Number of pages: 8