Bayesian Incentive-Compatible Bandit Exploration

Cited by: 16
Authors
Mansour, Yishay [1]
Slivkins, Aleksandrs [2]
Syrgkanis, Vasilis [3]
Affiliations
[1] Tel Aviv Univ, Sch Comp Sci, IL-6997801 Tel Aviv, Israel
[2] Microsoft Res, New York, NY 10011 USA
[3] Microsoft Res, Cambridge, MA 02142 USA
Keywords
mechanism design; multiarmed bandits; regret; Bayesian incentive-compatibility; CLINICAL-TRIAL DESIGN; MULTIARMED BANDIT; ALGORITHMS; SIGNATURE; REGRET; BOUNDS; ERA;
DOI
10.1287/opre.2019.1949
Chinese Library Classification number
C93 [Management Science];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
As self-interested individuals ("agents") make decisions over time, they utilize information revealed by other agents in the past and produce information that may help agents in the future. This phenomenon is common in a wide range of scenarios in the Internet economy, as well as in medical decisions. Each agent would like to exploit: select the best action given the current information, but would prefer the previous agents to explore: try out various alternatives to collect information. A social planner, by means of a carefully designed recommendation policy, can incentivize the agents to balance the exploration and exploitation so as to maximize social welfare. We model the planner's recommendation policy as a multiarm bandit algorithm under incentive-compatibility constraints induced by agents' Bayesian priors. We design a bandit algorithm which is incentive-compatible and has asymptotically optimal performance, as expressed by regret. Further, we provide a black-box reduction from an arbitrary multiarm bandit algorithm to an incentive-compatible one, with only a constant multiplicative increase in regret. This reduction works for very general bandit settings that incorporate contexts and arbitrary partial feedback.
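The two notions the abstract relies on, Bayesian incentive-compatibility and regret, can be written out as follows. This is a sketch under assumed notation that does not appear in this record: $K$ arms with unknown mean rewards $\mu_1, \dots, \mu_K$ drawn from a prior known to all agents, $I_t$ the arm recommended to the agent in round $t$, and $T$ the time horizon.

```latex
% Assumed notation (not from this record):
%   \mu_i : mean reward of arm i, drawn from the common prior
%   I_t   : arm recommended to the agent arriving in round t
%   T     : time horizon

% Bayesian incentive-compatibility: conditional on being recommended arm i,
% no other arm j looks better in posterior expectation.
\[
  \mathbb{E}\bigl[\mu_i - \mu_j \,\bigm|\, I_t = i\bigr] \;\ge\; 0
  \qquad \text{for all rounds } t \text{ and all arms } i, j
  \text{ with } \Pr[I_t = i] > 0 .
\]

% Bayesian regret: expected reward lost relative to always playing the best arm,
% with the expectation also taken over the prior.
\[
  R(T) \;=\; T \cdot \mathbb{E}\Bigl[\max_{i} \mu_i\Bigr]
  \;-\; \mathbb{E}\Bigl[\sum_{t=1}^{T} \mu_{I_t}\Bigr] .
\]
```

Under the first condition, following the recommendation is a best response for each agent, so the planner can treat the recommendation policy as a bandit algorithm and measure its quality by $R(T)$.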
Pages: 1132-1161
Page count: 30
Related papers
50 in total
  • [31] BRICK: Asynchronous Incentive-Compatible Payment Channels
    Avarikioti, Zeta
    Kokoris-Kogias, Eleftherios
    Wattenhofer, Roger
    Zindros, Dionysis
    FINANCIAL CRYPTOGRAPHY AND DATA SECURITY, FC 2021, PT II, 2021, 12675 : 209 - 230
  • [32] Incentive-Compatible Opportunistic Routing for Wireless Networks
    Wu, Fan
    Chen, Tingting
    Zhong, Sheng
    Li, Li Erran
    Yang, Yang Richard
    MOBICOM'08: PROCEEDINGS OF THE FOURTEENTH ACM INTERNATIONAL CONFERENCE ON MOBILE COMPUTING AND NETWORKING, 2008, : 303 - +
  • [33] An Incentive-Compatible Smart Contract for Decentralized Commerce
    Schwartzbach, Nikolaj I.
    2021 IEEE INTERNATIONAL CONFERENCE ON BLOCKCHAIN AND CRYPTOCURRENCY (ICBC), 2021,
  • [34] Incentive-Compatible Interdomain Routing with Linear Utilities
    Hall, Alexander
    Nikolova, Evdokia
    Papadimitriou, Christos
    INTERNET MATHEMATICS, 2008, 5 (04) : 395 - 410
  • [35] Incentive-compatible interdomain routing with linear utilities
    Hall, Alexander
    Nikolova, Evdokia
    Papadimitriou, Christos
    INTERNET AND NETWORK ECONOMICS, PROCEEDINGS, 2007, 4858 : 232 - +
  • [36] Optimal incentive-compatible mechanisms in active systems
    Enaleev, A. K.
    AUTOMATION AND REMOTE CONTROL, 2013, 74 (03) : 491 - 505
  • [37] INCENTIVE-COMPATIBLE COST-ALLOCATION SCHEMES
    SCHMEIDLER, D
    TAUMAN, Y
    JOURNAL OF ECONOMIC THEORY, 1994, 63 (02) : 189 - 207
  • [38] Incentive-compatible online auctions for digital goods
    Bar-Yossef, Z
    Hildrum, K
    Wu, F
    PROCEEDINGS OF THE THIRTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2002, : 964 - 970
  • [39] On the Foundations of Ex Post Incentive-Compatible Mechanisms
    Yamashita, Takuro
    Zhu, Shuguang
    AMERICAN ECONOMIC JOURNAL-MICROECONOMICS, 2022, 14 (04) : 494 - 514
  • [40] No-Regret and Incentive-Compatible Online Learning
    Freeman, Rupert
    Pennock, David M.
    Podimata, Chara
    Vaughan, Jennifer Wortman
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,