Sliding mode heading control for AUV based on continuous hybrid model-free and model-based reinforcement learning

Cited by: 16
Authors
Wang, Dianrui [1 ]
Shen, Yue [1 ]
Wan, Junhe [1 ]
Sha, Qixin [1 ]
Li, Guangliang [1 ]
Chen, Guanzhong [1 ]
He, Bo [1 ]
Affiliation
[1] Ocean Univ China, Sch Informat Sci & Engn, Qingdao 266000, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Autonomous underwater vehicle (AUV); Model-based reinforcement learning; Model-free reinforcement learning; Deterministic policy gradient (DPG); Sliding mode control (SMC); Nonlinear systems; Adaptive control; PID control; Design
DOI
10.1016/j.apor.2021.102960
CLC number
P75 [Ocean Engineering]
Subject classification codes
0814; 081505; 0824; 082401
Abstract
For autonomous underwater vehicles (AUVs), heading control is of key importance for achieving high-performance locomotion. In this study, heading control is realized with the robust sliding mode control (SMC) method. The controller's performance is strongly affected by its parameters, yet manually adjusting them is time-consuming and labor-intensive, and most existing tuning methods rely on an accurate AUV model or prior knowledge, both of which are difficult to obtain. This study therefore addresses the problem of automatically tuning the SMC parameters through reinforcement learning (RL). First, an AUV dynamic model, with and without current influence, was established. Second, a continuous hybrid Model-based Model-free (MbMf) RL method based on the deterministic policy gradient was introduced and explained. Then, the framework for tuning the SMC parameters with this RL method was described. Finally, to demonstrate the robustness and effectiveness of the approach, extensive numerical simulations were conducted on the established AUV model. The results show that the method tunes the SMC parameters automatically and performs more effectively than SMC with fixed parameters or SMC with a purely model-free learner.
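To make the tuning scheme sketched in the abstract concrete, below is a minimal Python illustration, not the authors' implementation: a sliding mode heading controller whose surface slope and switching gain are produced by a deterministic policy. The first-order sliding surface, the tanh boundary-layer smoothing, the one-degree-of-freedom yaw model, the gain ranges, and the linear policy are all assumptions introduced for illustration; the paper's full AUV dynamics and its hybrid MbMf DPG learner are not reproduced here.

```python
import numpy as np

# Minimal illustrative sketch (not the paper's implementation): an SMC heading
# controller whose gains come from a learned deterministic policy. The sliding
# surface s = e_dot + lam*e, the tanh reaching law, the 1-DOF yaw dynamics,
# and the gain ranges are assumptions made for this example.

def smc_yaw_torque(e, e_dot, lam, k, phi=0.1):
    """SMC law for heading error e = psi_ref - psi. lam, k > 0 are the
    tunable SMC parameters; phi is a boundary-layer width replacing
    sign(s) with tanh(s/phi) to reduce chattering."""
    s = e_dot + lam * e           # first-order sliding surface
    return k * np.tanh(s / phi)   # boundary-layer-smoothed reaching law

def actor_gains(state, W, b):
    """Hypothetical deterministic policy: maps the tracking state to the
    SMC gains (lam, k), squashed into assumed positive ranges."""
    raw = np.tanh(W @ state + b)              # each component in (-1, 1)
    lam = 0.5 + 2.0 * (raw[0] + 1.0) / 2.0    # lam in [0.5, 2.5]
    k = 0.1 + 5.0 * (raw[1] + 1.0) / 2.0      # k in [0.1, 5.1]
    return lam, k

# Toy 1-DOF yaw model psi_ddot = (tau - d*psi_dot)/I, an assumption standing
# in for the paper's full AUV dynamics (with or without current influence).
I, d, dt = 10.0, 2.0, 0.05
psi, psi_dot, psi_ref = 0.0, 0.0, np.deg2rad(30.0)
W = 0.1 * np.random.default_rng(0).standard_normal((2, 2))
b = np.zeros(2)

for _ in range(400):
    e, e_dot = psi_ref - psi, -psi_dot        # psi_ref is constant here
    lam, k = actor_gains(np.array([e, e_dot]), W, b)
    tau = smc_yaw_torque(e, e_dot, lam, k)
    psi_dot += dt * (tau - d * psi_dot) / I   # Euler-integrate yaw dynamics
    psi += dt * psi_dot

print(f"final heading error: {np.rad2deg(psi_ref - psi):.2f} deg")
```

In the framework the abstract describes, the policy parameters (here W and b) would be trained by the hybrid MbMf DPG learner against the simulated AUV model; in this sketch they are fixed random values so the example runs standalone, and a small residual error from the boundary layer is expected.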
Pages: 14