Aversion to Option Loss in a Restless Bandit Task

Authors
Navarro D.J. [1]
Tran P. [1]
Baz N. [1]
Affiliation
[1] School of Psychology, University of New South Wales, Sydney, 2052, NSW
Funding
Australian Research Council
Keywords
Bandit tasks; Dynamic environments; Loss aversion; Reinforcement learning; Sequential decision making
DOI
10.1007/s42113-018-0010-8
Abstract
In everyday life, people need to make choices without full information about the environment, which poses an explore-exploit dilemma in which one must balance the need to learn about the world and the need to obtain rewards from it. The explore-exploit dilemma is often studied using the multi-armed restless bandit task, in which people repeatedly select from multiple options, and human behaviour is modelled as a form of reinforcement learning via Kalman filters. Inspired by work in the judgment and decision-making literature, we present two experiments using multi-armed bandit tasks in both static and dynamic environments, in situations where options can become unviable and vanish if they are not pursued. A Kalman filter model using Thompson sampling provides an excellent account of human learning in a standard restless bandit task, but there are systematic departures in the vanishing bandit task. We explore the nature of this loss aversion signal and consider theoretical explanations for the results. © 2018, Springer Nature Switzerland AG.
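The abstract names a Kalman filter learner combined with Thompson sampling. The sketch below illustrates that general technique on a simulated restless bandit; it is not the authors' implementation. The function names, the number of arms, the prior, and the drift and noise values are all assumptions chosen for illustration, and the vanishing-option manipulation is not modelled.

```python
import random


def kalman_update(mean, var, reward, chosen, drift_var, noise_var):
    """Update beliefs after one trial and return the priors for the next trial.

    Only the chosen arm is observed; every arm's variance then grows by
    drift_var because the true payoffs follow a random walk.
    """
    new_mean, new_var = list(mean), list(var)
    gain = var[chosen] / (var[chosen] + noise_var)  # Kalman gain
    new_mean[chosen] = mean[chosen] + gain * (reward - mean[chosen])
    new_var[chosen] = (1.0 - gain) * var[chosen]
    # Diffusion step: uncertainty about every arm increases between trials.
    new_var = [v + drift_var for v in new_var]
    return new_mean, new_var


def thompson_choice(mean, var, rng):
    """Draw one sample per arm from its Gaussian belief and pick the largest."""
    samples = [rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
    return max(range(len(samples)), key=samples.__getitem__)


def simulate(n_arms=4, n_trials=200, drift_sd=2.0, noise_sd=4.0, seed=0):
    """Run the learner on a simulated restless bandit (all values hypothetical)."""
    rng = random.Random(seed)
    true_means = [rng.gauss(50.0, 10.0) for _ in range(n_arms)]
    mean = [50.0] * n_arms   # assumed prior mean for every arm
    var = [100.0] * n_arms   # assumed prior variance for every arm
    choices = []
    for _ in range(n_trials):
        arm = thompson_choice(mean, var, rng)
        reward = rng.gauss(true_means[arm], noise_sd)
        mean, var = kalman_update(
            mean, var, reward, arm, drift_sd ** 2, noise_sd ** 2
        )
        # The environment is restless: true payoffs drift on every trial.
        true_means = [m + rng.gauss(0.0, drift_sd) for m in true_means]
        choices.append(arm)
    return choices, mean, var


if __name__ == "__main__":
    choices, mean, var = simulate()
    print("times each arm was chosen:", [choices.count(k) for k in range(4)])
```

Because unchosen arms only accumulate variance, their sampled values spread out over time, which is how this kind of learner comes to revisit neglected options without an explicit exploration bonus.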
Pages: 151-164
Number of pages: 13