Finite Horizon Stochastic Optimal Control of Nonlinear Two-Player Zero-Sum Games under Communication Constraint

Cited by: 0
Authors
Xu, Hao [1 ]
Jagannathan, S. [1 ]
Affiliations
[1] Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Rolla, MO 65409 USA
Keywords
SYSTEMS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the finite-horizon stochastic optimal control of nonlinear two-player zero-sum games between control and disturbance input players, referred to as the nonlinear networked control system (NNCS) two-player zero-sum game, in the presence of unknown system dynamics and a communication network with delays and packet losses, using neuro-dynamic programming (NDP). The overall objective is to find the optimal control input while maximizing disturbance attenuation. First, a novel online neural network (NN) identifier is introduced to estimate the unknown control and disturbance coefficient matrices, which are needed to generate the optimal control input. Then, a critic NN and two actor NNs are introduced to learn the time-varying solution to the Hamilton-Jacobi-Isaacs (HJI) equation and to determine the stochastic optimal control and disturbance policies in a forward-in-time manner. Finally, with the proposed novel NN weight update laws, Lyapunov theory is used to show that all closed-loop signals and NN weights are uniformly ultimately bounded (UUB) over the finite horizon, with ultimate bounds that are a function of the initial conditions and the final time. Further, the approximated control input and disturbance signals approach the saddle-point equilibrium within finite time. Simulation results are included.
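For orientation, the kind of problem the abstract describes can be sketched in a generic textbook form. This is only an illustrative formulation, assuming discrete-time dynamics that are affine in the control and disturbance inputs; the symbols $f$, $g$, $h$, $Q$, $R$, $\gamma$, $\phi$, and $N$ are assumptions introduced here and are not taken from the paper, whose exact formulation (including how network delays and packet losses enter the dynamics) may differ.

```latex
% Illustrative only: generic finite-horizon stochastic zero-sum game.
% All symbols are assumed for this sketch, not quoted from the paper.
\begin{align}
  % Dynamics with control input u_k and disturbance input d_k
  x_{k+1} &= f(x_k) + g(x_k)\,u_k + h(x_k)\,d_k, \\
  % Stage cost: penalize state and control, reward disturbance attenuation
  r(x_k, u_k, d_k) &= Q(x_k) + u_k^{\top} R\, u_k - \gamma^{2}\, d_k^{\top} d_k, \\
  % Time-varying value function over the horizon k = 0, \dots, N-1
  V(x_k, k) &= \min_{u_k} \max_{d_k}\,
      \mathbb{E}\!\left[\, r(x_k, u_k, d_k) + V(x_{k+1}, k+1) \,\right], \\
  % Terminal condition at the final time N
  V(x_N, N) &= \phi(x_N).
\end{align}
```

Under these assumptions, a saddle-point equilibrium is a policy pair $(u^{*}, d^{*})$ for which the order of $\min$ and $\max$ in the recursion can be exchanged without changing the value. The terminal condition is what makes the value function explicitly time-dependent, which is why the abstract speaks of a time-varying solution to the HJI equation.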
Pages: 239-244
Number of pages: 6
Related papers
50 in total
  • [31] Online Solution of Nonlinear Two-Player Zero-Sum Games Using Synchronous Policy Iteration
    Vamvoudakis, Kyriakos G.
    Lewis, F. L.
    49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, : 3040 - 3047
  • [32] Adaptive Learning Based Output-Feedback Optimal Control of CT Two-Player Zero-Sum Games
    Zhao, Jun
    Lv, Yongfeng
    Zhao, Ziliang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (03) : 1437 - 1441
  • [33] GPI-Based design for partially unknown nonlinear two-player zero-sum games
    Yu, Lin
    Xiong, Junlin
    Xie, Min
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2023, 360 (03): : 2068 - 2088
  • [34] Online solution of nonlinear two-player zero-sum games using synchronous policy iteration
    Vamvoudakis, Kyriakos G.
    Lewis, F.L.
    International Journal of Robust and Nonlinear Control, 2012, 22 (13): : 1460 - 1483
  • [35] Online solution of nonlinear two-player zero-sum games using synchronous policy iteration
    Vamvoudakis, Kyriakos G.
    Lewis, F. L.
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2012, 22 (13) : 1460 - 1483
  • [36] Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games
    Ling, Chun Kai
    Fang, Fei
    Kolter, J. Zico
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6104 - 6111
  • [37] An LP Approach for Solving Two-Player Zero-Sum Repeated Bayesian Games
    Li, Lichun
    Langbort, Cedric
    Shamma, Jeff
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (09) : 3716 - 3731
  • [38] Improved saddle point prediction in stochastic two-player zero-sum games with a deep learning approach
    Wu, Dawen
    Lisser, Abdel
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [39] Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
    Zhao, Yulai
    Tian, Yuandong
    Lee, Jason D.
    Du, Simon S.
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [40] A METHOD TO SOLVE TWO-PLAYER ZERO-SUM MATRIX GAMES IN CHAOTIC ENVIRONMENT
    Khalifa, Hamiden Abd El-Wahed
    Kumar, Pavan
    INDEPENDENT JOURNAL OF MANAGEMENT & PRODUCTION, 2021, 12 (01): : 115 - 126