Predictive representations can link model-based reinforcement learning to model-free mechanisms

Cited by: 133
Authors
Russek, Evan M. [1 ]
Momennejad, Ida [2 ,3 ]
Botvinick, Matthew M. [4 ,5 ]
Gershman, Samuel J. [6 ,7 ]
Daw, Nathaniel D. [2 ,3 ]
Affiliations
[1] NYU, Ctr Neural Sci, New York, NY 10003 USA
[2] Princeton Univ, Princeton Neurosci Inst, Princeton, NJ 08544 USA
[3] Princeton Univ, Dept Psychol, Princeton, NJ 08544 USA
[4] DeepMind, London, England
[5] UCL, Gatsby Computat Neurosci Unit, London, England
[6] Harvard Univ, Dept Psychol, 33 Kirkland St, Cambridge, MA 02138 USA
[7] Harvard Univ, Ctr Brain Sci, Cambridge, MA 02138 USA
Funding
National Institutes of Health (USA);
Keywords
BASAL GANGLIA; PREFRONTAL CORTEX; COGNITIVE MAP; SUCCESSOR REPRESENTATION; ORBITOFRONTAL CORTEX; DOPAMINE NEURONS; TASK; REWARD; HIPPOCAMPUS; EXPERIENCE;
DOI
10.1371/journal.pcbi.1005768
Chinese Library Classification (CLC)
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that neural circuits supporting model-based behavior are structurally homologous to and overlapping with those thought to carry out model-free temporal difference (TD) learning. Here, we lay out a family of approaches by which model-based computation may be built upon a core of TD learning. The foundation of this framework is the successor representation, a predictive state representation that, when combined with TD learning of value predictions, can produce a subset of the behaviors associated with model-based learning, while requiring less decision-time computation than dynamic programming. Using simulations, we delineate the precise behavioral capabilities enabled by evaluating actions using this approach, and compare them to those demonstrated by biological organisms. We then introduce two new algorithms that build upon the successor representation while progressively mitigating its limitations. Because this framework can account for the full range of observed putatively model-based behaviors while still utilizing a core TD mechanism, we suggest that it represents a neurally plausible family of mechanisms for model-based evaluation.
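To make the abstract's central computational idea concrete: the successor representation (SR) stores, for each state, the expected discounted future occupancy of every other state, learned with the same TD machinery used for model-free values; values are then read out as the product of the SR with learned reward weights. Below is a minimal sketch in Python, assuming a toy five-state linear track under a fixed policy; the environment, parameter values, and function names are illustrative assumptions and are not drawn from the paper's simulations.

```python
import numpy as np

# Minimal sketch of successor-representation (SR) learning with TD updates,
# assuming a toy 5-state linear track under a fixed rightward policy. The
# environment, parameters, and names are illustrative, not the paper's
# actual simulations.

n_states = 5   # s0 -> s1 -> s2 -> s3 -> s4 (absorbing, rewarded on entry)
gamma = 0.95   # discount factor
alpha = 0.1    # learning rate

# M[s, s'] estimates expected discounted future occupancy of s' from s.
# Initialized to the identity: each state trivially occupies itself once.
# The absorbing state's row is never updated, so it stays at identity.
M = np.eye(n_states)
w = np.zeros(n_states)   # w[s] estimates the reward received in state s

def sr_td_update(s, s_next, r):
    """TD(0)-style update of the SR and reward weights for one transition."""
    onehot = np.eye(n_states)[s]
    # SR target: occupancy of s now, plus discounted occupancy from s_next
    M[s] += alpha * ((onehot + gamma * M[s_next]) - M[s])
    # Reward weights via a simple delta rule (reward received entering s_next)
    w[s_next] += alpha * (r - w[s_next])

def values():
    # Values factorize as V = M @ w: predictive occupancy times reward.
    return M @ w

# Experience: repeatedly walk the track; reward only on reaching the end.
for _ in range(500):
    for s in range(n_states - 1):
        r = 1.0 if s + 1 == n_states - 1 else 0.0
        sr_td_update(s, s + 1, r)

print(np.round(values(), 2))   # values rise toward the rewarded end state

# Revaluation: moving the reward changes only w; V = M @ w updates at once,
# with no relearning of M.
w_moved = np.zeros(n_states)
w_moved[2] = 1.0
print(np.round(M @ w_moved, 2))
```

The final revaluation step illustrates the flexibility at stake: changing the reward weights immediately changes the values computed as V = M @ w, without relearning the occupancy matrix, which is the model-based-like behavior the abstract attributes to SR-based evaluation.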
Pages: 35
Related Papers (50 total)
  • [1] Model-Based and Model-Free Replay Mechanisms for Reinforcement Learning in Neurorobotics
    Massi, Elisa
    Barthelemy, Jeanne
    Mailly, Juliane
    Dromnelle, Remi
    Canitrot, Julien
    Poniatowski, Esther
    Girard, Benoit
    Khamassi, Mehdi
    [J]. FRONTIERS IN NEUROROBOTICS, 2022, 16
  • [2] Model-based and Model-free Reinforcement Learning for Visual Servoing
    Farahmand, Amir Massoud
    Shademan, Azad
    Jagersand, Martin
    Szepesvari, Csaba
    [J]. ICRA: 2009 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-7, 2009: 4135-4142
  • [3] Model-Based and Model-Free Mechanisms of Human Motor Learning
    Haith, Adrian M.
    Krakauer, John W.
    [J]. PROGRESS IN MOTOR CONTROL: NEURAL, COMPUTATIONAL AND DYNAMIC APPROACHES, 2013, 782: 1-21
  • [4] Expert Initialized Hybrid Model-Based and Model-Free Reinforcement Learning
    Langaa, Jeppe
    Sloth, Christoffer
    [J]. 2023 EUROPEAN CONTROL CONFERENCE, ECC, 2023
  • [5] Hybrid control for combining model-based and model-free reinforcement learning
    Pinosky, Allison
    Abraham, Ian
    Broad, Alexander
    Argall, Brenna
    Murphey, Todd D.
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2023, 42 (06): 337-355
  • [6] Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning
    Swazinna, Phillip
    Udluft, Steffen
    Hein, Daniel
    Runkler, Thomas
    [J]. IFAC PAPERSONLINE, 2022, 55 (15): 19-26
  • [7] Learning Representations in Model-Free Hierarchical Reinforcement Learning
    Rafati, Jacob
    Noelle, David C.
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 10009-10010
  • [8] EEG-based classification of learning strategies: model-based and model-free reinforcement learning
    Kim, Dongjae
    Weston, Charles
    Lee, Sang Wan
    [J]. 2018 6TH INTERNATIONAL CONFERENCE ON BRAIN-COMPUTER INTERFACE (BCI), 2018: 146-148
  • [9] Parallel model-based and model-free reinforcement learning for card sorting performance
    Steinke, Alexander
    Lange, Florian
    Kopp, Bruno
    [J]. SCIENTIFIC REPORTS, 2020, 10 (01)
  • [10] Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning
    Lehnert, Lucas
    Littman, Michael L.
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21