Bayesian Incentive-Compatible Bandit Exploration

Cited by: 16
Authors
Mansour, Yishay [1]
Slivkins, Aleksandrs [2]
Syrgkanis, Vasilis [3]
Affiliations
[1] Tel Aviv Univ, Sch Comp Sci, IL-6997801 Tel Aviv, Israel
[2] Microsoft Res, New York, NY 10011 USA
[3] Microsoft Res, Cambridge, MA 02142 USA
Keywords
mechanism design; multiarmed bandits; regret; Bayesian incentive-compatibility; CLINICAL-TRIAL DESIGN; MULTIARMED BANDIT; ALGORITHMS; SIGNATURE; REGRET; BOUNDS; ERA
DOI
10.1287/opre.2019.1949
Chinese Library Classification (CLC) number
C93 [Management science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
As self-interested individuals ("agents") make decisions over time, they use information revealed by earlier agents and produce information that may help later agents. This phenomenon is common across the Internet economy and in medical decisions. Each agent would like to exploit, that is, select the best action given the current information, but would prefer that previous agents explore, that is, try out various alternatives to collect information. A social planner, by means of a carefully designed recommendation policy, can incentivize the agents to balance exploration and exploitation so as to maximize social welfare. We model the planner's recommendation policy as a multiarmed bandit algorithm under incentive-compatibility constraints induced by agents' Bayesian priors. We design a bandit algorithm that is incentive-compatible and has asymptotically optimal performance, as expressed by regret. Further, we provide a black-box reduction from an arbitrary multiarmed bandit algorithm to an incentive-compatible one, with only a constant multiplicative increase in regret. This reduction works for very general bandit settings that incorporate contexts and arbitrary partial feedback.
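The incentive-compatibility constraint described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's algorithm. It assumes a hypothetical two-arm setting with Bernoulli rewards, a Beta(a1, b1) prior on arm 1, an independent prior with mean m2 on arm 2, and m2 below the prior mean of arm 1. One agent has already tried arm 1. The planner then recommends arm 2 to the next agent with probability eps regardless of the data (explore), and otherwise recommends the arm with the higher posterior mean (exploit). The function names `max_explore_prob` and `arm2_gain` are made up for this example. The recommendation of arm 2 is followed only if, conditional on receiving it, arm 2 has the higher expected reward, which caps eps.

```python
from fractions import Fraction


def posterior_mean(a, b, reward):
    """Posterior mean of a Beta(a, b) arm after one Bernoulli reward (0 or 1)."""
    return Fraction(a + reward, a + b + 1)


def exploit_gain(a1, b1, m2):
    """E[max(0, m2 - posterior mean of arm 1)] over the first observed reward."""
    m1 = Fraction(a1, a1 + b1)        # prior mean of arm 1
    reward_probs = {1: m1, 0: 1 - m1}  # prior predictive of the first reward
    return sum(
        p * max(Fraction(0), m2 - posterior_mean(a1, b1, r))
        for r, p in reward_probs.items()
    )


def arm2_gain(a1, b1, m2, eps):
    """E[(mu2 - mu1) * 1{arm 2 is recommended}] for exploration probability eps.

    The agent follows a recommendation of arm 2 only if this is non-negative.
    """
    m1 = Fraction(a1, a1 + b1)
    return eps * (m2 - m1) + (1 - eps) * exploit_gain(a1, b1, m2)


def max_explore_prob(a1, b1, m2):
    """Largest eps for which recommending arm 2 stays incentive-compatible."""
    m1 = Fraction(a1, a1 + b1)
    if m2 >= m1:
        raise ValueError("this sketch assumes arm 2 has the lower prior mean")
    gain = exploit_gain(a1, b1, m2)
    return gain / (gain + (m1 - m2))


# Assumed example: uniform Beta(1, 1) prior on arm 1, prior mean 2/5 on arm 2.
print(max_explore_prob(1, 1, Fraction(2, 5)))  # prints 1/4
```

In this assumed example the planner can send at most a quarter of recommendations to pure exploration of arm 2. A larger eps makes `arm2_gain` negative, so an agent told to pull arm 2 would rather pull arm 1.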
Pages: 1132-1161
Page count: 30