Trial-by-trial dynamics of reward prediction error-associated signals during extinction learning and renewal

Cited by: 16
Authors
Packheiser, Julian [1]
Donoso, Jose R. [2]
Cheng, Sen [2]
Guentuerkuen, Onur [1]
Pusch, Roland [1]
Affiliations
[1] Ruhr Univ Bochum, Fac Psychol, Dept Biopsychol, Univ Str 150, D-44780 Bochum, Germany
[2] Ruhr Univ Bochum, Inst Neural Computat, Univ Str 150, D-44780 Bochum, Germany
Source
PROGRESS IN NEUROBIOLOGY | 2021 / Vol. 197
Keywords
Reward prediction error; Extinction learning; Renewal; Trial-by-trial learning; Electrophysiology; DOPAMINE NEURONS ENCODE; PIGEON COLUMBA-LIVIA; PREFRONTAL CORTEX; NIDOPALLIUM CAUDOLATERALE; VARIABILITY; MICRODRIVE; RESPONSES; CONTEXT
DOI
10.1016/j.pneurobio.2020.101901
Chinese Library Classification (CLC) number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
Reward prediction errors (RPEs) have been suggested to drive associative learning processes, but their precise temporal dynamics at the single-neuron level remain elusive. Here, we studied the neural correlates of RPEs, focusing on their trial-by-trial dynamics during an operant extinction learning paradigm. Within a single behavioral session, pigeons went through acquisition, extinction and renewal (the context-dependent response recovery after extinction). We recorded single units from the avian prefrontal cortex analogue, the nidopallium caudolaterale (NCL), and found that the omission of reward during extinction led to a peak of population activity that moved backwards in time as trials progressed. The chronological order of these signal changes over the course of learning was indicative of temporal shifts of RPE signals that started during reward omission and then moved backwards to the presentation of the conditioned stimulus. Switches from operant choices to avoidance behavior (and vice versa) coincided with changes in population activity during the animals' decision-making. At the single-unit level, we found more diverse patterns: some neurons' activity correlated with RPE signals, whereas others correlated with the absolute value during the outcome period. Finally, we demonstrated that mere sensory contextual changes during the renewal test were sufficient to elicit signals likely associated with RPEs. Thus, RPEs are truly expectancy-driven, since they can be elicited by changes in reward expectation without an actual change in the quality or quantity of reward.
Pages: 13