Model-free Reinforcement Learning for Stochastic Stackelberg Security Games

Cited by: 0
Authors
Mishra, Rajesh K.
Vasal, Deepanshu
Vishwanath, Sriram
Institutions
Keywords
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
In this paper, we consider a sequential stochastic Stackelberg game with two players, a leader and a follower. The follower observes the state of the system privately, while the leader does not. The players play a Stackelberg equilibrium, in which the follower plays a best response to the leader's strategy. In such a scenario, the leader has the advantage of committing to a policy that maximizes its returns, knowing that the follower will best respond to that policy. Such a pair of strategies is defined as a Stackelberg equilibrium of the game. Recently, [1] provided a sequential decomposition algorithm to compute the Stackelberg equilibrium of such games, which allows Markovian equilibrium policies to be computed in linear time rather than the doubly exponential time required previously. In this paper, we extend that idea to the case where the state update dynamics are not known to the players and propose a reinforcement learning (RL) algorithm based on Expected SARSA that learns the Stackelberg equilibrium policy by simulating a model of the underlying Markov decision process (MDP). We use particle filters to estimate the belief update of a common agent, which computes the optimal policy based on the information common to both players. We present a security game example to illustrate the policy learned by our algorithm.
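As a rough illustration of the two ingredients named in the abstract, the sketch below combines a tabular Expected SARSA update on the common agent's action values with a bootstrap particle filter tracking the common belief over the follower's private state in a toy security game. It is not the authors' implementation: the game itself, the fixed stochastic follower response (standing in for an equilibrium best response), the mode-based discretisation of the belief, and every name in the code (transition_sim, follower_policy, leader_reward, the sizes, and the learning constants) are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 3            # hypothetical size of the follower's private state space
N_LEADER_ACTIONS = 2    # hypothetical leader prescriptions (e.g. which target to defend)
N_FOLLOWER_ACTIONS = 2  # hypothetical follower moves (attack / hold back)
N_PARTICLES = 300
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

# Tabular Q-values of the common agent over (belief summary, leader action);
# the belief is summarised by its mode only to keep the sketch tabular.
Q = np.zeros((N_STATES, N_LEADER_ACTIONS))

def transition_sim(state, leader_action):
    """Black-box simulator of the unknown state dynamics (model-free setting)."""
    p_move = 0.7 if leader_action == 0 else 0.4
    return (state + 1) % N_STATES if rng.random() < p_move else state

def follower_policy(state, leader_action):
    """Hypothetical follower response: less likely to attack a defended target."""
    p_attack = 0.2 if state == leader_action else 0.8
    return np.array([p_attack, 1.0 - p_attack])

def leader_reward(state, leader_action, follower_action):
    """Hypothetical leader payoff: a successful attack on an undefended target costs 1."""
    return -1.0 if (follower_action == 0 and leader_action != state) else 0.1

def belief_update(particles, leader_action, follower_action):
    """One bootstrap particle-filter step for the common belief: reweight the
    particles by the likelihood of the observed follower action, resample,
    then propagate each particle through the simulator."""
    weights = np.array([follower_policy(s, leader_action)[follower_action]
                        for s in particles])
    weights /= weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return np.array([transition_sim(s, leader_action) for s in particles[idx]])

def belief_mode(particles):
    """Coarse belief summary used to index the tabular Q-function."""
    return int(np.bincount(particles, minlength=N_STATES).argmax())

def behaviour_probs(q_row, eps=EPSILON):
    """Epsilon-greedy distribution; Expected SARSA bootstraps with its expectation."""
    probs = np.full(len(q_row), eps / len(q_row))
    probs[int(q_row.argmax())] += 1.0 - eps
    return probs

true_state = int(rng.integers(N_STATES))
particles = rng.integers(0, N_STATES, size=N_PARTICLES)

for _ in range(20000):
    b = belief_mode(particles)
    a_l = int(rng.integers(N_LEADER_ACTIONS)) if rng.random() < EPSILON else int(Q[b].argmax())
    a_f = int(rng.choice(N_FOLLOWER_ACTIONS, p=follower_policy(true_state, a_l)))

    r = leader_reward(true_state, a_l, a_f)
    true_state = transition_sim(true_state, a_l)
    particles = belief_update(particles, a_l, a_f)
    b_next = belief_mode(particles)

    # Expected SARSA target: expectation of Q at the next belief under the behaviour policy.
    target = r + GAMMA * behaviour_probs(Q[b_next]) @ Q[b_next]
    Q[b, a_l] += ALPHA * (target - Q[b, a_l])

print("Greedy leader prescription per belief mode:", Q.argmax(axis=1))
```

In the paper's full algorithm the common agent's prescription would map the entire belief to follower-type-dependent actions; collapsing the belief to its mode here is only a simplification to keep the example short and tabular.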
Pages: 348 - 353
Page count: 6
Related Papers
50 records in total
  • [1] A Model-Free Solution for Stackelberg Games Using Reinforcement Learning and Projection Approaches
    Abouheaf, Mohammed
    Gueaieb, Wail
    Miah, Suruz
    Abdelhameed, Esam H.
    [J]. 2024 IEEE INTERNATIONAL SYMPOSIUM ON ROBOTIC AND SENSORS ENVIRONMENTS, ROSE 2024, 2024,
  • [2] Model-Free Reinforcement Learning for Stochastic Games with Linear Temporal Logic Objectives
    Bozkurt, Alper Kamil
    Wang, Yu
    Zavlanos, Michael M.
    Pajic, Miroslav
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 10649 - 10655
  • [3] Model-Free Reinforcement Learning for Mean Field Games
    Mishra, Rajesh
    Vasal, Deepanshu
    Vishwanath, Sriram
    [J]. IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (04): : 2141 - 2151
  • [4] Model-Free Reinforcement Learning of Impedance Control in Stochastic Environments
    Stulp, Freek
    Buchli, Jonas
    Ellmer, Alice
    Mistry, Michael
    Theodorou, Evangelos A.
    Schaal, Stefan
    [J]. IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, 2012, 4 (04) : 330 - 341
  • [5] Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems
    Cai, Tianchi
    Bao, Shenliao
    Jiang, Jiyan
    Zhou, Shiji
    Zhang, Wenpeng
    Gu, Lihong
    Gu, Jinjie
    Zhang, Guannan
    [J]. PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 2179 - 2183
  • [6] Stackelberg games for model-free continuous-time stochastic systems based on adaptive dynamic programming
    Liu, Xikui
    Ge, Yingying
    Li, Yan
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 2019, 363
  • [7] Model-free Reinforcement Learning for Non-stationary Mean Field Games
    Mishra, Rajesh K.
    Vasal, Deepanshu
    Vishwanath, Sriram
    [J]. 2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 1032 - 1037
  • [8] From deterministic to stochastic: an interpretable stochastic model-free reinforcement learning framework for portfolio optimization
    Song, Zitao
    Wang, Yining
    Qian, Pin
    Song, Sifan
    Coenen, Frans
    Jiang, Zhengyong
    Su, Jionglong
    [J]. APPLIED INTELLIGENCE, 2023, 53 (12) : 15188 - 15203
  • [9] Discrete-time dynamic graphical games: model-free reinforcement learning solution
    Abouheaf M.I.
    Lewis F.L.
    Mahmoud M.S.
    Mikulski D.G.
    [J]. Control Theory and Technology, (1): 55 - 69