Trial-by-trial dynamics of reward prediction error-associated signals during extinction learning and renewal

Cited by: 16

Authors:

Packheiser, Julian ^{[1]}

Donoso, Jose R. ^{[2]}

Cheng, Sen ^{[2]}

Guentuerkuen, Onur ^{[1]}

Pusch, Roland ^{[1]}

Affiliations:

[1] Ruhr Univ Bochum, Fac Psychol, Dept Biopsychol, Univ Str 150, D-44780 Bochum, Germany

[2] Ruhr Univ Bochum, Inst Neural Computat, Univ Str 150, D-44780 Bochum, Germany

Source:

PROGRESS IN NEUROBIOLOGY | 2021 / Volume 197

Keywords:

Reward prediction error; Extinction learning; Renewal; Trial-by-trial learning; Electrophysiology; DOPAMINE NEURONS ENCODE; PIGEON COLUMBA-LIVIA; PREFRONTAL CORTEX; NIDOPALLIUM CAUDOLATERALE; VARIABILITY; MICRODRIVE; RESPONSES; CONTEXT;

DOI:

10.1016/j.pneurobio.2020.101901

Chinese Library Classification (CLC) number:

Q189 [Neuroscience];

Discipline classification code:

071006;

Abstract:

Reward prediction errors (RPEs) have been suggested to drive associative learning processes, but their precise temporal dynamics at the single-neuron level remain elusive. Here, we studied the neural correlates of RPEs, focusing on their trial-by-trial dynamics during an operant extinction learning paradigm. Within a single behavioral session, pigeons went through acquisition, extinction and renewal (the context-dependent response recovery after extinction). We recorded single units from the avian prefrontal cortex analogue, the nidopallium caudolaterale (NCL), and found that the omission of reward during extinction led to a peak of population activity that moved backwards in time as trials progressed. The chronological order of these signal changes during the progress of learning was indicative of temporal shifts of RPE signals that started during reward omission and then moved backwards to the presentation of the conditioned stimulus. Switches from operant choices to avoidance behavior (and vice versa) coincided with changes in population activity during the animals' decision-making. At the single-unit level, we found more diverse patterns, where some neurons' activity correlated with RPE signals whereas others correlated with the absolute value during the outcome period. Finally, we demonstrated that mere sensory contextual changes during the renewal test were sufficient to elicit signals likely associated with RPEs. Thus, RPEs are truly expectancy-driven since they can be elicited by changes in reward expectation, without an actual change in the quality or quantity of reward.


Pages: 13

