Application of an Off-Policy Reinforcement Learning Algorithm for H∞ Control Design of Nonlinear Structural Systems With Completely Unknown Dynamics

Authors
Amirmojahedi, M. [1]
Mojoodi, A. [2]
Shojaee, Saeed [1]
Hamzehei-Javaran, Saleh [1]
Affiliations
[1] Shahid Bahonar Univ Kerman, Civil Engn Dept, Kerman, Iran
[2] Amirkabir Univ Technol, Dept Mech Engn, Tehran, Iran
Source
EARTHQUAKE ENGINEERING & STRUCTURAL DYNAMICS | 2025 / Vol. 54 / No. 04
Keywords
H-infinity control; two-player zero-sum game; online reinforcement learning; nonlinear building; neural networks; ZERO-SUM GAMES; TRACKING CONTROL; FEEDBACK;
DOI
10.1002/eqe.4299
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813 ;
Abstract
This paper proposes a model-free, online, off-policy algorithm based on reinforcement learning (RL) for attenuating the vibration of earthquake-excited structures by designing an optimal H-infinity controller. The design relies on solving a two-player zero-sum game, whose Hamilton-Jacobi-Isaacs (HJI) equation is extremely difficult, and often impossible, to solve for the value function and the associated optimal controller. The proposed strategy uses an actor-critic-disturbance structure to learn the solution of the HJI equation online and forward in time, without requiring any knowledge of the system dynamics. The control policy, the disturbance policy, and the value function are approximated by the actor, disturbance, and critic neural networks (NNs), respectively. Within a policy iteration scheme, the NN weights are computed by the least-squares (LS) method in each iteration. The convergence of the proposed algorithm is investigated through two distinct examples. Furthermore, the performance of this off-policy RL strategy in reducing the response of a seismically excited nonlinear structure equipped with an active mass damper (AMD) is studied for two cases of state feedback. The simulation results demonstrate the effectiveness of the proposed algorithm when applied to civil engineering structures.
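For orientation, the HJI equation mentioned in the abstract commonly takes the following form in the H-infinity control literature for an input-affine system. This is a sketch of the standard formulation only: the symbols f, g, k, Q, R, and gamma are assumed here for illustration and are not taken from the paper, whose notation and cost function may differ.

```latex
% Assumed system: control input u, disturbance w (e.g., ground excitation)
\begin{align}
  \dot{x} &= f(x) + g(x)\,u + k(x)\,w, \\
  % Zero-sum game value: u minimizes, w maximizes, gamma is the attenuation level
  V(x(t)) &= \int_{t}^{\infty} \left( Q(x) + u^{\top} R\,u - \gamma^{2} w^{\top} w \right) d\tau, \\
  % HJI equation for the optimal value function V^*
  0 &= Q(x) + \nabla V^{*\top} f(x)
      - \tfrac{1}{4}\, \nabla V^{*\top} g(x) R^{-1} g(x)^{\top} \nabla V^{*}
      + \tfrac{1}{4\gamma^{2}}\, \nabla V^{*\top} k(x) k(x)^{\top} \nabla V^{*}, \\
  % Saddle-point policies recovered from V^*
  u^{*} &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}, \qquad
  w^{*} = \tfrac{1}{2\gamma^{2}}\, k(x)^{\top} \nabla V^{*}.
\end{align}
```

Under this formulation the equation is quadratic in the gradient of V*, which is why closed-form solutions are rarely available and iterative, data-driven approximations are used instead.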
Pages: 1210 - 1228
Page count: 19
Related papers
50 in total
  • [1] H∞ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning
    Modares, Hamidreza
    Lewis, Frank L.
    Jiang, Zhong-Ping
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (10) : 2550 - 2562
  • [2] Off-Policy Reinforcement Learning for H∞ Control Design
    Luo, Biao
    Wu, Huai-Ning
    Huang, Tingwen
    IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45 (01) : 65 - 76
  • [3] Off-policy reinforcement learning for tracking control of discrete-time Markov jump linear systems with completely unknown dynamics
    Huang Z.
    Tu Y.
    Fang H.
    Wang H.
    Zhang L.
    Shi K.
    He S.
    Journal of the Franklin Institute, 2023, 360 (03) : 2361 - 2378
  • [4] Off-policy reinforcement learning algorithm for robust optimal control of uncertain nonlinear systems
    Amirparast, Ali
    Kamal Hosseini Sani, S.
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (08) : 5419 - 5437
  • [5] Off-policy algorithm based Hierarchical optimal control for completely unknown dynamic systems
    Cui, Xiaohong
    Chen, Jiayu
    Wang, Binrui
    Xu, Suan
    NEUROCOMPUTING, 2022, 488 : 669 - 680
  • [6] Online Off-Policy Reinforcement Learning for Optimal Control of Unknown Nonlinear Systems Using Neural Networks
    Zhu, Liao
    Wei, Qinglai
    Guo, Ping
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (08) : 5112 - 5122
  • [7] Control of an AUV with completely unknown dynamics and multi-asymmetric input constraints via off-policy reinforcement learning
    Mohammadi, Mehdi
    Arefi, Mohammad Mehdi
    Vafamand, Navid
    Kaynak, Okyay
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (07) : 5255 - 5265
  • [8] Control of an AUV with completely unknown dynamics and multi-asymmetric input constraints via off-policy reinforcement learning
    Mehdi Mohammadi
    Mohammad Mehdi Arefi
    Navid Vafamand
    Okyay Kaynak
    Neural Computing and Applications, 2022, 34 : 5255 - 5265
  • [9] Synchronous optimal control method for nonlinear systems with saturating actuators and unknown dynamics using off-policy integral reinforcement learning
    Zhang, Zenglian
    Song, Ruizhuo
    Cao, Min
    NEUROCOMPUTING, 2019, 356 : 162 - 169
  • [10] Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control
    Farahmandi, Alireza
    Reitz, Brian
    Debord, Mark
    Philbrick, Douglas
    Estabridis, Katia
    Hewer, Gary
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211