Adaptive Learning Based Output-Feedback Optimal Control of CT Two-Player Zero-Sum Games

Cited by: 17
Authors
Zhao, Jun [1 ]
Lv, Yongfeng [2 ]
Zhao, Ziliang [3 ]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Mech & Elect Engn, Qingdao 266590, Peoples R China
[2] Taiyuan Univ Technol, Coll Elect & Power Engn, Taiyuan 030024, Peoples R China
[3] Shandong Univ Sci & Technol, Coll Transportat, Qingdao 266590, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Games; Optimal control; Adaptive learning; Game theory; Cost function; Observers; Estimation error; Output-feedback optimal control; adaptive learning; zero-sum games; SYSTEMS;
DOI
10.1109/TCSII.2021.3112050
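Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract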
Although optimal control with full state feedback has been well studied, solving the output-feedback optimal control problem online is difficult, in particular when learning online the Nash equilibrium solution of continuous-time (CT) two-player zero-sum differential games. We propose an adaptive learning algorithm to address this tricky problem. A modified game algebraic Riccati equation (MGARE) is derived by tailoring its state-feedback counterpart. An adaptive online learning method is proposed to approximate the solution to the MGARE from online data, where two operations (i.e., vectorization and the Kronecker product) are adopted to reconstruct the MGARE. Only system output information is needed to implement the developed learning algorithm. Simulation results are presented to illustrate the proposed control and learning method.
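The vectorization and Kronecker product step mentioned in the abstract can be illustrated with a minimal sketch. The MGARE itself is not given in this record, so the example below uses a plain Lyapunov equation A^T P + P A + Q = 0 as a stand-in (an assumption, not the paper's equation). The function names `vec` and `solve_lyapunov_by_vec` and the matrices `A` and `Q` are hypothetical, chosen only for illustration. The sketch relies on the standard identity vec(AXB) = (B^T ⊗ A) vec(X) to turn a matrix equation into a linear system in vec(P), which can then be solved by least squares.

```python
import numpy as np

def vec(M):
    # Column-stacking vectorization, matching the identity
    # vec(A X B) = (B^T kron A) vec(X).
    return M.reshape(-1, order="F")

def solve_lyapunov_by_vec(A, Q):
    # Stand-in problem (assumption): A^T P + P A + Q = 0.
    n = A.shape[0]
    I = np.eye(n)
    # vec(A^T P) = (I kron A^T) vec(P) and vec(P A) = (A^T kron I) vec(P).
    K = np.kron(I, A.T) + np.kron(A.T, I)
    # Least squares mirrors how a regression form would be fitted from data.
    p, *_ = np.linalg.lstsq(K, -vec(Q), rcond=None)
    return p.reshape(n, n, order="F")

# Hypothetical stable system matrix and weight.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
Q = np.eye(2)
P = solve_lyapunov_by_vec(A, Q)
residual = A.T @ P + P @ A + Q
```

Here `residual` should be numerically zero. A data-driven method would presumably assemble the regression matrix from measured signals rather than from a known `A`, but the linear-in-vec(P) structure is the same idea.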
Pages: 1437-1441
Number of pages: 5
Related papers
50 in total
  • [21] Finite Horizon Stochastic Optimal Control of Nonlinear Two-Player Zero-Sum Games under Communication Constraint
    Xu, Hao
    Jagannathan, S.
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 239 - 244
  • [22] A Generalized Minimax Q-Learning Algorithm for Two-Player Zero-Sum Stochastic Games
    Diddigi, Raghuram Bharadwaj
    Kamanchi, Chandramouli
    Bhatnagar, Shalabh
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (09) : 4816 - 4823
  • [23] The Lagging Anchor Algorithm: Reinforcement Learning in Two-Player Zero-Sum Games with Imperfect Information
    Fredrik A. Dahl
    Machine Learning, 2002, 49 : 5 - 37
  • [24] Learning Extensive-Form Perfect Equilibria in Two-Player Zero-Sum Sequential Games
    Bernasconi, Martino
    Marchesi, Alberto
    Trovo, Francesco
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [25] A Meta-evolutionary Learning Algorithm for Opponent Adaptation in Two-player Zero-sum Games
    Wu Z.
    Li K.
    Xu H.
    Xing J.-L.
    Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (10): : 2462 - 2473
  • [26] The lagging anchor algorithm: Reinforcement learning in two-player zero-sum games with imperfect information
    Dahl, FA
    MACHINE LEARNING, 2002, 49 (01) : 5 - 37
  • [27] Equilibrium payoffs in repeated two-player zero-sum games of finite automata
    Baskov, O. V.
    INTERNATIONAL JOURNAL OF GAME THEORY, 2019, 48 (02) : 423 - 431
  • [28] An LP Approach for Solving Two-Player Zero-Sum Repeated Bayesian Games
    Li, Lichun
    Langbort, Cedric
    Shamma, Jeff
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (09) : 3716 - 3731
  • [29] A METHOD TO SOLVE TWO-PLAYER ZERO-SUM MATRIX GAMES IN CHAOTIC ENVIRONMENT
    Khalifa, Hamiden Abd El-Wahed
    Kumar, Pavan
    INDEPENDENT JOURNAL OF MANAGEMENT & PRODUCTION, 2021, 12 (01): : 115 - 126
  • [30] Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
    Zhao, Yulai
    Tian, Yuandong
    Lee, Jason D.
    Du, Simon S.
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151