Learning reward frequency over reward probability: A tale of two learning rules

Cited by: 9
Authors
Don, Hilary J. [1]
Otto, A. Ross [2]
Cornwall, Astin C. [1]
Davis, Tyler [3]
Worthy, Darrell A. [1]
Affiliations
[1] Texas A&M Univ, College Stn, TX 77843 USA
[2] McGill Univ, Montreal, PQ, Canada
[3] Texas Tech Univ, Lubbock, TX 79409 USA
Keywords
Reinforcement learning; Delta rule; Decay rule; Reward frequency; Prediction error; Probability learning; DECISION-MAKING; PREDICTION ERRORS; WORKING-MEMORY; MODEL; AMBIGUITY; INFERENCE;
DOI
10.1016/j.cognition.2019.104042
Chinese Library Classification
B84 [Psychology];
Discipline classification code
04 ; 0402 ;
Abstract
Learning about the expected value of choice alternatives associated with reward is critical for adaptive behavior. Although human choice preferences are affected by the presentation frequency of reward-related alternatives, this may not be captured by some dominant models of value learning, such as the delta rule. In this study, we examined whether reward learning is driven more by learning the probability of reward provided by each option, or by how frequently each option has been rewarded, and assessed how well models based on average reward (e.g., the Delta model) and models based on cumulative reward (e.g., the Decay model) can account for choice preferences. In a binary-outcome choice task, participants selected between pairs of options that had reward probabilities of 0.65 (A) versus 0.35 (B) or 0.75 (C) versus 0.25 (D). Crucially, during training there were twice as many AB trials as CD trials, such that option A was associated with higher cumulative reward, while option C gave higher average reward. Participants then decided between novel combinations of options (e.g., AC). Most participants preferred option A over C, a result predicted by the Decay model but not by the Delta model. We also compared the Delta and Decay models to both simpler and more complex models that assumed additional mechanisms, such as representation of uncertainty. Overall, models that assume learning about cumulative reward provided the best account of the data.
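
The contrast between the two learning rules can be illustrated with a minimal sketch. The abstract gives no equations, so the update rules below are the commonly used textbook forms and are an assumption here: the delta rule moves the chosen option's value toward the reward it just produced, while the decay rule shrinks every value on each trial and adds the reward to the chosen option. The parameter values (`alpha`, `decay`, `n_blocks`), the function names, and two further simplifications (the simulated learner always picks the better option of each pair, and each binary outcome is replaced by its expected value) are likewise illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: the update rules, parameters, and choice policy
# are assumptions and are not taken from the paper.
REWARD_PROB = {"A": 0.65, "B": 0.35, "C": 0.75, "D": 0.25}


def delta_update(values, chosen, reward, alpha):
    """Delta rule: move the chosen option's value toward the reward."""
    values[chosen] += alpha * (reward - values[chosen])


def decay_update(values, chosen, reward, decay):
    """Decay rule: every value decays; the chosen option adds its reward."""
    for option in values:
        values[option] *= decay
    values[chosen] += reward


def train(n_blocks=100, alpha=0.1, decay=0.9):
    """Run both rules on a schedule with twice as many AB as CD trials."""
    delta_values = {option: 0.0 for option in REWARD_PROB}
    decay_values = {option: 0.0 for option in REWARD_PROB}
    # Simplification (assumed): the learner always picks the better option
    # of each pair, i.e. A on the two AB trials and C on the one CD trial.
    for chosen in ["A", "A", "C"] * n_blocks:
        # Simplification (assumed): use the expected reward, not a 0/1 draw.
        reward = REWARD_PROB[chosen]
        delta_update(delta_values, chosen, reward, alpha)
        decay_update(decay_values, chosen, reward, decay)
    return delta_values, decay_values


delta_values, decay_values = train()
print("Delta prefers:", max("AC", key=lambda option: delta_values[option]))
print("Decay prefers:", max("AC", key=lambda option: decay_values[option]))
```

Under these assumptions the delta values settle near each option's reward probability, so C ends above A, whereas the decay values track recency-weighted cumulative reward, so the more frequently presented A ends above C. This matches the qualitative prediction described in the abstract, but the sketch says nothing about model fit.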
Pages: 16