The gradient of the reinforcement landscape influences sensorimotor learning

Cited: 32
Authors
Cashaback, Joshua G. A. [1 ,2 ]
Lao, Christopher K. [3 ]
Palidis, Dimitrios J. [4 ,5 ,6 ]
Coltman, Susan K. [4 ,5 ,6 ]
McGregor, Heather R. [4 ,5 ,6 ]
Gribble, Paul L. [3 ,5 ,6 ,7 ]
Affiliations
[1] Univ Calgary, Human Performance Lab, Calgary, AB, Canada
[2] Univ Calgary, Hotchkiss Brain Inst, Calgary, AB, Canada
[3] Western Univ, Dept Physiol & Pharmacol, London, ON, Canada
[4] Western Univ, Grad Program Neurosci, London, ON, Canada
[5] Western Univ, Brain & Mind Inst, London, ON, Canada
[6] Western Univ, Dept Psychol, London, ON, Canada
[7] Haskins Labs Inc, New Haven, CT 06511 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
TASK-IRRELEVANT; DECISION-THEORY; MOTOR; ADAPTATION; MOVEMENT; VARIABILITY; REWARD; REPRESENTATION; MEMORY;
DOI
10.1371/journal.pcbi.1006839
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Consideration of previous successes and failures is essential to mastering a motor skill. Much of what we know about how humans and animals learn from such reinforcement feedback comes from experiments that involve sampling from a small number of discrete actions. Yet, it is less understood how we learn through reinforcement feedback when sampling from a continuous set of possible actions. Navigating a continuous set of possible actions likely requires using gradient information to maximize success. Here we addressed how humans adapt the aim of their hand when experiencing reinforcement feedback that was associated with a continuous set of possible actions. Specifically, we manipulated the change in the probability of reward given a change in motor action (the reinforcement gradient) to study its influence on learning. We found that participants learned faster when exposed to a steep gradient compared to a shallow gradient. Further, when initially positioned between a steep and a shallow gradient that rose in opposite directions, participants were more likely to ascend the steep gradient. We introduce a model that captures our results and several features of motor learning. Taken together, our work suggests that the sensorimotor system relies on temporally recent and spatially local gradient information to drive learning.

Author summary: In recent years it has been shown that reinforcement feedback may also subserve our ability to acquire new motor skills. Here we address how the reinforcement gradient influences motor learning. We found that a steeper gradient increased both the rate and likelihood of learning. Moreover, while many mainstream theories posit that we build a full representation of the reinforcement landscape, both our data and model suggest that the sensorimotor system relies primarily on temporally recent and spatially local gradient information to drive learning. Our work provides new insights into how we sample from a continuous action-reward landscape to maximize success.
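To make the notion of a reinforcement gradient concrete, the Python sketch below simulates a learner that aims within a continuous action space, receives only binary reward whose probability depends on the executed action, and updates its aim using only the most recent, locally sampled outcome. The reward_probability function, the slope values, and the success-driven update rule are illustrative assumptions for this sketch, not the authors' fitted model; they are meant only to show why a steeper gradient can speed learning.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward_probability(action, slope, centre=0.0):
    """Hypothetical reinforcement landscape: the probability of binary
    reward rises linearly along the action dimension, clipped to [0, 1].
    'slope' controls how steep the reinforcement gradient is."""
    return np.clip(0.5 + slope * (action - centre), 0.0, 1.0)

def simulate(slope, n_trials=200, explore_sd=0.5, learn_rate=1.0):
    """Minimal learner using only temporally recent, spatially local
    information: it explores around its current aim and, if the explored
    action is rewarded, shifts the aim toward that action; after failure
    the aim is left unchanged. Illustrative sketch, not the paper's model."""
    aim = 0.0
    aims = np.empty(n_trials)
    for t in range(n_trials):
        action = aim + rng.normal(0.0, explore_sd)      # motor noise / exploration
        rewarded = rng.random() < reward_probability(action, slope)
        if rewarded:                                    # local, recent update toward success
            aim += learn_rate * (action - aim)
        aims[t] = aim
    return aims

steep = simulate(slope=0.20)    # steep reinforcement gradient
shallow = simulate(slope=0.05)  # shallow reinforcement gradient
print(f"final aim, steep gradient:   {steep[-1]:.2f}")
print(f"final aim, shallow gradient: {shallow[-1]:.2f}")
```

Run with different seeds, the aim typically climbs the landscape noticeably faster under the steep gradient than the shallow one, mirroring the behavioural result summarized in the abstract.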
Pages: 27
Related papers
50 items in total
  • [21] A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
    Kim, Dong-Ki
    Liu, Miao
    Riemer, Matthew
    Sun, Chuangchuang
    Abdulhai, Marwa
    Habibi, Golnaz
    Lopez-Cot, Sebastian
    Tesauro, Gerald
    How, Jonathan P.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [22] A Sensorimotor Reinforcement Learning Framework for Physical Human-Robot Interaction
    Ghadirzadeh, Ali
    Butepage, Judith
    Maki, Atsuto
    Kragic, Danica
    Bjorkman, Marten
    2016 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2016), 2016, : 2682 - 2688
  • [23] Policy gradient reinforcement learning for fast quadrupedal locomotion
    Kohl, N
    Stone, P
    2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1- 5, PROCEEDINGS, 2004, : 2619 - 2624
  • [24] Fast Stochastic Kalman Gradient Descent for Reinforcement Learning
    Totaro, Simone
    Jonsson, Anders
    LEARNING FOR DYNAMICS AND CONTROL, VOL 144, 2021, 144
  • [25] Policy Gradient using Weak Derivatives for Reinforcement Learning
    Bhatt, Sujay
    Koppel, Alec
    Krishnamurthy, Vikram
    2019 53RD ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2019,
  • [26] On the use of the policy gradient and Hessian in inverse reinforcement learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    INTELLIGENZA ARTIFICIALE, 2020, 14 (01) : 117 - 150
  • [27] Direct gradient-based reinforcement learning for robot behavior learning
    El-Fakdi, Andres
    Carreras, Marc
    Ridao, Pere
    INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS II, 2007, : 175 - +
  • [28] Online reinforcement learning control via discontinuous gradient
    Arellano-Muro, Carlos A.
    Castillo-Toledo, Bernardino
    Di Gennaro, Stefano
    Loukianov, Alexander G.
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2024, 38 (05) : 1762 - 1776
  • [29] A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning
    Pham, Nhan H.
    Nguyen, Lam M.
    Phan, Dzung T.
    Nguyen, Phuong Ha
    van Dijk, Marten
    Tran-Dinh, Quoc
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 374 - 384
  • [30] Independent Policy Gradient Methods for Competitive Reinforcement Learning
    Daskalakis, Constantinos
    Foster, Dylan J.
    Golowich, Noah
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33