Stability Analysis for Autonomous Vehicle Navigation Trained over Deep Deterministic Policy Gradient

Cited by: 1
Authors
Cabezas-Olivenza, Mireya [1 ]
Zulueta, Ekaitz [1 ]
Sanchez-Chica, Ander [1 ]
Fernandez-Gamiz, Unai [2 ]
Teso-Fz-Betono, Adrian [1 ]
Affiliations
[1] Univ Basque Country UPV EHU, Syst Engn & Automat Control Dept, Nieves Cano 12, Vitoria 01006, Spain
[2] Univ Basque Country UPV EHU, Dept Nucl & Fluid Mech, Nieves Cano 12, Vitoria 01006, Spain
Keywords
navigation; neural network; autonomous vehicle; reinforcement learning; DDPG; Lyapunov; stability; Q-learning; dynamic window approach; robot
DOI
10.3390/math11010132
Chinese Library Classification
O1 [Mathematics]
Subject Classification Codes
0701; 070101
Abstract
The Deep Deterministic Policy Gradient (DDPG) algorithm is a reinforcement learning algorithm that combines Q-learning with a policy. Nevertheless, this algorithm produces failures that are not well understood. Rather than searching for those errors, this study presents a way to evaluate the suitability of the results obtained. With autonomous vehicle navigation as the target application, the DDPG algorithm is applied to obtain an agent capable of generating trajectories. This agent is evaluated in terms of stability through the Lyapunov function, verifying whether the proposed navigation objectives are achieved. The reward function of the DDPG is used for this evaluation because it is not known whether the actor and critic neural networks are correctly trained. Two agents are obtained and compared in terms of stability, demonstrating that the Lyapunov function can be used as an evaluation method for agents obtained by the DDPG algorithm. By verifying stability over a fixed future horizon, it is possible to determine whether the obtained agent is valid and can be used as a vehicle controller, so a task-satisfaction assessment can be performed. Furthermore, the proposed analysis indicates which parts of the navigation area are insufficiently covered by the training.
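To make the evaluation idea in the abstract concrete, the following is a minimal, illustrative sketch, not the paper's implementation: it rolls a trained policy forward over a fixed future horizon and tests a Lyapunov-style condition, assuming the candidate function V(x) = ||x - goal||^2 (squared distance to the navigation goal). All names (lyapunov_stable, policy, step, goal, horizon) are hypothetical.

```python
import numpy as np

def lyapunov_stable(policy, step, x0, goal, horizon=50, tol=1e-9):
    """Illustrative stability check for a trained navigation agent.

    Rolls the policy forward from state x0 over a fixed future horizon
    and tests the candidate Lyapunov function V(x) = ||x - goal||^2:
    the rollout passes only if V stays positive away from the goal and
    strictly decreases at every step.

    policy -- maps a state to an action (e.g., a trained DDPG actor)
    step   -- vehicle/environment model: (state, action) -> next state
    """
    V = lambda x: float(np.sum((np.asarray(x) - goal) ** 2))
    x, v_prev = x0, V(x0)
    for _ in range(horizon):
        x = step(x, policy(x))
        v = V(x)
        if v <= tol:           # goal reached within the horizon
            return True
        if v >= v_prev - tol:  # V failed to decrease: unstable region
            return False
        v_prev = v
    return True                # V decreased monotonically over the horizon
```

Sweeping such a check over a grid of start states would flag the parts of the navigation area that the abstract describes as insufficiently trained.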
Pages: 27
Related Papers
50 items in total
  • [41] Adaptive Navigation Algorithm with Deep Learning for Autonomous Underwater Vehicle
    Ma, Hui
    Mu, Xiaokai
    He, Bo
    SENSORS, 2021, 21 (19)
  • [42] State Representation Learning for Minimax Deep Deterministic Policy Gradient
    Hu, Dapeng
    Jiang, Xuesong
    Wei, Xiumei
    Wang, Jian
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT I, 2019, 11775 : 481 - 487
  • [43] Deep Deterministic Policy Gradient With Prioritized Sampling for Power Control
    Zhou, Shiyang
    Cheng, Yufan
    Lei, Xia
    Duan, Huanhuan
    IEEE ACCESS, 2020, 8 : 194240 - 194250
  • [44] A Method of Attitude Control Based on Deep Deterministic Policy Gradient
    Zhang, Jian
    Wu, Fengge
    Zhao, Junsuo
    Xu, Fanjiang
    COGNITIVE SYSTEMS AND SIGNAL PROCESSING, PT II, 2019, 1006 : 197 - 207
  • [45] Network Architecture Reasoning via Deep Deterministic Policy Gradient
    Liu, Huidong
    Du, Fang
    Tang, Xiaofen
    Liu, Hao
    Yu, Zhenhua
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020
  • [46] Controlling Bicycle Using Deep Deterministic Policy Gradient Algorithm
    Le Pham Tuyen
    Chung, TaeChoong
    2017 14TH INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS AND AMBIENT INTELLIGENCE (URAI), 2017, : 413 - 417
  • [47] Semicentralized Deep Deterministic Policy Gradient in Cooperative StarCraft Games
    Xie, Dong
    Zhong, Xiangnan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) : 1584 - 1593
  • [48] Network Architecture for Optimizing Deep Deterministic Policy Gradient Algorithms
    Zhang, Haifei
    Xu, Jian
    Zhang, Jian
    Liu, Quan
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [49] Dynamical Motor Control Learned with Deep Deterministic Policy Gradient
    Shi, Haibo
    Sun, Yaoru
    Li, Jie
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2018, 2018
  • [50] Byzantine-Robust Federated Deep Deterministic Policy Gradient
    Lin, Qifeng
    Ling, Qing
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4013 - 4017