Application of an Off-Policy Reinforcement Learning Algorithm for H∞ Control Design of Nonlinear Structural Systems With Completely Unknown Dynamics

Cited by: 0

Authors:

Amirmojahedi, M. [1]

Mojoodi, A. [2]

Shojaee, Saeed [1]

Hamzehei-Javaran, Saleh [1]

Affiliations:

[1] Shahid Bahonar Univ Kerman, Civil Engn Dept, Kerman, Iran

[2] Amirkabir Univ Technol, Dept Mech Engn, Tehran, Iran

Source:

EARTHQUAKE ENGINEERING & STRUCTURAL DYNAMICS | 2025 / Vol. 54 / No. 04

Keywords:

H-infinity control; two-player zero-sum game; online reinforcement learning; nonlinear building; neural networks; ZERO-SUM GAMES; TRACKING CONTROL; FEEDBACK;

DOI:

10.1002/eqe.4299

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

This paper proposes a model-free, online, off-policy reinforcement learning (RL) algorithm for attenuating the vibration of earthquake-excited structures by designing an optimal H-infinity controller. The design requires solving a two-player zero-sum game, whose Hamilton-Jacobi-Isaacs (HJI) equation is extremely difficult, and often impossible, to solve for the value function and the associated optimal controller. The proposed strategy uses an actor-critic-disturbance structure to learn the solution of the HJI equation online and forward in time, without requiring any knowledge of the system dynamics. The control policy, the disturbance policy, and the value function are approximated by the actor, disturbance, and critic neural networks (NNs), respectively. Within a policy iteration scheme, the NN weights are computed by the least-squares (LS) method at each iteration. The convergence of the proposed algorithm is investigated through two distinct examples. The performance of this off-policy RL strategy in reducing the response of a seismically excited nonlinear structure with an active mass damper (AMD) is then studied for two cases of state feedback. The simulation results demonstrate the effectiveness of the proposed algorithm when applied to civil engineering structures.
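
For orientation, the zero-sum game behind an H-infinity design of this kind is commonly written as below. This is a sketch of the standard continuous-time, input-affine formulation from the off-policy H-infinity literature, not the paper's own equations. The symbols f, g, k, Q, R, and \gamma are assumptions introduced here for illustration, and the paper's notation and cost function may differ.

```latex
% Assumed input-affine dynamics: state x, control u, disturbance w
\dot{x} = f(x) + g(x)\,u + k(x)\,w

% Assumed value function of the two-player zero-sum game, with
% attenuation level \gamma > 0, Q(x) \ge 0, and R > 0
V(x(t)) = \int_{t}^{\infty}
  \left( Q(x) + u^{\top} R\, u - \gamma^{2} w^{\top} w \right) d\tau

% Stationarity of the Hamiltonian in u and w gives the saddle-point policies
u^{*} = -\tfrac{1}{2}\, R^{-1} g^{\top}(x)\, \nabla V^{*}(x), \qquad
w^{*} = \tfrac{1}{2\gamma^{2}}\, k^{\top}(x)\, \nabla V^{*}(x)

% Substituting u^{*} and w^{*} back yields the HJI equation
0 = Q(x) + \nabla V^{*\top} f(x)
  - \tfrac{1}{4}\, \nabla V^{*\top} g(x)\, R^{-1} g^{\top}(x)\, \nabla V^{*}
  + \tfrac{1}{4\gamma^{2}}\, \nabla V^{*\top} k(x)\, k^{\top}(x)\, \nabla V^{*}
```

Under these assumptions the HJI equation is a nonlinear partial differential equation in V^{*} that depends on f, g, and k. This is why a model-free scheme instead approximates V^{*}, u^{*}, and w^{*} with critic, actor, and disturbance networks fitted to measured data.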

Pages: 1210 - 1228

Page count: 19

50 items in total

[1] H∞ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning
Modares, Hamidreza
Lewis, Frank L.
Jiang, Zhong-Ping
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (10) : 2550 - 2562
[2] Off-Policy Reinforcement Learning for H∞ Control Design
Luo, Biao
Wu, Huai-Ning
Huang, Tingwen
IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45 (01) : 65 - 76
[3] Off-policy reinforcement learning for tracking control of discrete-time Markov jump linear systems with completely unknown dynamics
Huang Z.
Tu Y.
Fang H.
Wang H.
Zhang L.
Shi K.
He S.
Journal of the Franklin Institute, 2023, 360 (03) : 2361 - 2378
[4] Off-policy reinforcement learning algorithm for robust optimal control of uncertain nonlinear systems
Amirparast, Ali
Kamal Hosseini Sani, S.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (08) : 5419 - 5437
[5] Off-policy algorithm based Hierarchical optimal control for completely unknown dynamic systems
Cui, Xiaohong
Chen, Jiayu
Wang, Binrui
Xu, Suan
NEUROCOMPUTING, 2022, 488 : 669 - 680
[6] Online Off-Policy Reinforcement Learning for Optimal Control of Unknown Nonlinear Systems Using Neural Networks
Zhu, Liao
Wei, Qinglai
Guo, Ping
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (08) : 5112 - 5122
[7] Control of an AUV with completely unknown dynamics and multi-asymmetric input constraints via off-policy reinforcement learning
Mohammadi, Mehdi
Arefi, Mohammad Mehdi
Vafamand, Navid
Kaynak, Okyay
NEURAL COMPUTING & APPLICATIONS, 2022, 34 (07) : 5255 - 5265
[8] Control of an AUV with completely unknown dynamics and multi-asymmetric input constraints via off-policy reinforcement learning
Mehdi Mohammadi
Mohammad Mehdi Arefi
Navid Vafamand
Okyay Kaynak
Neural Computing and Applications, 2022, 34 : 5255 - 5265
[9] Synchronous optimal control method for nonlinear systems with saturating actuators and unknown dynamics using off-policy integral reinforcement learning
Zhang, Zenglian
Song, Ruizhuo
Cao, Min
NEUROCOMPUTING, 2019, 356 : 162 - 169
[10] Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control
Farahmandi, Alireza
Reitz, Brian
Debord, Mark
Philbrick, Douglas
Estabridis, Katia
Hewer, Gary
LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
