Generalized reinforcement learning in perfect-information games

Cited by: 0
Authors
Maxwell Pak
Bing Xu
Affiliation
[1] Southwestern University of Finance and Economics, Research Institute of Economics and Management
Source
International Journal of Game Theory, 2016, 45 (04)
Keywords
Reinforcement learning; Extensive-form games; D83;
DOI
Not available
Abstract
This paper studies reinforcement learning in which players base their action choice on valuations they have for the actions. We identify two general conditions on valuation updating rules that together guarantee that the probability of playing a subgame perfect Nash equilibrium (SPNE) converges to one in games where no player is indifferent between two outcomes unless every other player is also indifferent. The same conditions guarantee that the fraction of times an SPNE is played converges to one almost surely. We also show that for additively separable valuations, in which valuations are the sum of empirical and error terms, the conditions guaranteeing convergence can be made more intuitive. In addition, we give four examples of valuations that satisfy our conditions. These examples represent different degrees of sophistication in learning behavior and include well-known examples of reinforcement learning.
Pages: 985 - 1011
Number of pages: 26
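The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model or its conditions: the two-stage entry game, its payoffs, the names `GAME`, `backward_induction`, `valuation`, and `simulate`, and the specific noise rule are all assumptions made for illustration. Each valuation is additively separable in the sense of the abstract: an empirical term (the average payoff the player has received from that action) plus an error term, here Gaussian noise that shrinks as the action is played more often. Players choose the action with the highest valuation, and the simulation records how often play follows the SPNE path found by backward induction.

```python
import random
from math import sqrt

# Hypothetical two-stage entry game (an assumption, not taken from the paper).
# Player 0 moves first: "Out" ends the game, "In" passes the move to player 1.
# Payoffs are (player 0, player 1); no player is indifferent between outcomes.
GAME = {
    "root": {"player": 0, "moves": {"Out": (1.0, 3.0), "In": "entry"}},
    "entry": {"player": 1,
              "moves": {"Fight": (0.0, 0.0), "Accommodate": (2.0, 1.0)}},
}


def backward_induction(node="root"):
    """Return (payoffs, action path) of the subgame perfect outcome at `node`."""
    player = GAME[node]["player"]
    best = None
    for action, child in GAME[node]["moves"].items():
        if isinstance(child, str):  # child is another decision node
            payoffs, path = backward_induction(child)
        else:  # child is a terminal payoff tuple
            payoffs, path = child, []
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + path)
    return best


def valuation(node, action, total, count, rng, noise_scale):
    """Empirical average payoff plus an error term that shrinks with use."""
    n = count[(node, action)]
    empirical = total[(node, action)] / n if n else 0.0
    error = rng.gauss(0.0, noise_scale / sqrt(1 + n))
    return empirical + error


def simulate(rounds=5000, noise_scale=1.0, seed=0):
    """Play the game repeatedly; return one bool per round (SPNE path played?)."""
    rng = random.Random(seed)
    total = {(n, a): 0.0 for n in GAME for a in GAME[n]["moves"]}
    count = {key: 0 for key in total}
    spne_path = backward_induction()[1]
    hits = []
    for _ in range(rounds):
        node, played = "root", []
        while True:
            # The player at this node picks the action with the highest valuation.
            action = max(
                GAME[node]["moves"],
                key=lambda a: valuation(node, a, total, count, rng, noise_scale),
            )
            played.append((node, action))
            child = GAME[node]["moves"][action]
            if not isinstance(child, str):
                payoffs = child
                break
            node = child
        # Each player who moved updates the empirical term of the chosen action.
        for visited, action in played:
            total[(visited, action)] += payoffs[GAME[visited]["player"]]
            count[(visited, action)] += 1
        hits.append([a for _, a in played] == spne_path)
    return hits


if __name__ == "__main__":
    print(backward_induction()[1])  # ['In', 'Accommodate']
    late = simulate()[-1000:]
    print(sum(late) / len(late))    # share of late rounds on the SPNE path
```

Note that the simulation only checks whether realized play follows the equilibrium path, which is a simplification of "playing an SPNE" as a full strategy profile. Because the noise on a rarely played action stays large, neglected actions keep being retried occasionally; this is one simple way to avoid locking in on an early bad draw, and it is a design choice of this sketch rather than a condition stated in the abstract.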
Related papers
50 in total
  • [1] Generalized reinforcement learning in perfect-information games
    Pak, Maxwell
    Xu, Bing
    [J]. INTERNATIONAL JOURNAL OF GAME THEORY, 2016, 45 (04) : 985 - 1011
  • [2] Perfect-Information Stochastic Games with Generalized Mean-Payoff Objectives
    Chatterjee, Krishnendu
    Doyen, Laurent
    [J]. PROCEEDINGS OF THE 31ST ANNUAL ACM-IEEE SYMPOSIUM ON LOGIC IN COMPUTER SCIENCE (LICS 2016), 2016, : 247 - 256
  • [3] Perfect-information stochastic parity games
    Zielonka, W
    [J]. FOUNDATIONS OF SOFTWARE SCIENCE AND COMPUTATION STRUCTURES, PROCEEDINGS, 2004, 2987 : 499 - 513
  • [4] A Perfect-Information Construction for Coordination in Games
    Berwanger, Dietmar
    Kaiser, Lukasz
    Puchala, Bernd
    [J]. IARCS ANNUAL CONFERENCE ON FOUNDATIONS OF SOFTWARE TECHNOLOGY AND THEORETICAL COMPUTER SCIENCE (FSTTCS 2011), 2011, 13 : 387 - 398
  • [5] Perfect-Information Games with Lower-Semicontinuous Payoffs
    Flesch, Janos
    Kuipers, Jeroen
    Mashiah-Yaakovi, Ayala
    Schoenmakers, Gijs
    Solan, Eilon
    Vrieze, Koos
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 2010, 35 (04) : 742 - 755
  • [6] Perfect-Information Stochastic Mean-Payoff Parity Games
    Chatterjee, Krishnendu
    Doyen, Laurent
    Gimbert, Hugo
    Oualhadj, Youssouf
    [J]. FOUNDATIONS OF SOFTWARE SCIENCE AND COMPUTATION STRUCTURES, 2014, 8412 : 210 - 225
  • [7] Rationality, Nash equilibrium and backwards induction in perfect-information games
    BenPorath, E
    [J]. REVIEW OF ECONOMIC STUDIES, 1997, 64 (01) : 23 - 46
  • [8] COMPLEXITY OF SOME 2-PERSON PERFECT-INFORMATION GAMES
    SCHAEFER, TJ
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1978, 16 (02) : 185 - 225
  • [9] Behavior and deliberation in perfect-information games: Nash equilibrium and backward induction
    Bonanno, Giacomo
    [J]. INTERNATIONAL JOURNAL OF GAME THEORY, 2018, 47 (03) : 1001 - 1032