Stability Analysis for Autonomous Vehicle Navigation Trained over Deep Deterministic Policy Gradient

Cited by: 1

Authors:

Cabezas-Olivenza, Mireya [1]

Zulueta, Ekaitz [1]

Sanchez-Chica, Ander [1]

Fernandez-Gamiz, Unai [2]

Teso-Fz-Betono, Adrian [1]

Affiliations:

[1] Univ Basque Country UPV EHU, Syst Engn & Automat Control Dept, Nieves Cano 12, Vitoria 01006, Spain

[2] Univ Basque Country UPV EHU, Dept Nucl & Fluid Mech, Nieves Cano 12, Vitoria 01006, Spain

Source:

MATHEMATICS | 2023 / Vol. 11 / Issue 01

Keywords:

navigation; neural network; autonomous vehicle; reinforcement learning; DDPG; Lyapunov; stability; Q-learning; DYNAMIC WINDOW APPROACH; ROBOT

DOI:

10.3390/math11010132

Chinese Library Classification (CLC) number:

O1 [Mathematics]

Subject classification codes:

0701; 070101

Abstract:

The Deep Deterministic Policy Gradient (DDPG) algorithm is a reinforcement learning algorithm that combines Q-learning with a policy. Nevertheless, this algorithm generates failures that are not well understood. Rather than looking for those errors, this study presents a way to evaluate the suitability of the results obtained. The DDPG algorithm is applied to autonomous vehicle navigation, obtaining an agent capable of generating trajectories. This agent is evaluated in terms of stability through the Lyapunov function, verifying whether the proposed navigation objectives are achieved. The reward function of the DDPG is used for this purpose, because it is unknown whether the actor and critic neural networks are correctly trained. Two agents are obtained and compared in terms of stability, demonstrating that the Lyapunov function can be used as an evaluation method for agents obtained by the DDPG algorithm. By verifying the stability at a fixed future horizon, it is possible to determine whether the obtained agent is valid and can be used as a vehicle controller, so a task-satisfaction assessment can be performed. Furthermore, the proposed analysis indicates which parts of the navigation area are insufficiently trained.
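The fixed-horizon stability check described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the reward is assumed to be the negative squared distance to the goal, the Lyapunov candidate is assumed to be the negated reward, the vehicle is a toy point mass, and `good_policy` and `bad_policy` are hypothetical stand-ins for trained DDPG actors with the goal at the origin.

```python
from typing import Callable, Dict, Iterable, Tuple

State = Tuple[float, float]
Action = Tuple[float, float]


def reward(state: State, goal: State) -> float:
    """Assumed reward (not from the paper): negative squared distance to the goal."""
    return -((state[0] - goal[0]) ** 2 + (state[1] - goal[1]) ** 2)


def lyapunov_candidate(state: State, goal: State) -> float:
    """Assumed candidate V = -reward, so V >= 0 and V = 0 only at the goal."""
    return -reward(state, goal)


def rollout(policy: Callable[[State], Action],
            step: Callable[[State, Action], State],
            state: State, horizon: int) -> State:
    """Apply the policy for `horizon` steps and return the final state."""
    for _ in range(horizon):
        state = step(state, policy(state))
    return state


def decreases_at_horizon(policy, step, state: State, goal: State, horizon: int) -> bool:
    """True if the candidate is strictly smaller after `horizon` steps."""
    v_now = lyapunov_candidate(state, goal)
    v_future = lyapunov_candidate(rollout(policy, step, state, horizon), goal)
    return v_future < v_now


def stability_map(policy, step, starts: Iterable[State],
                  goal: State, horizon: int) -> Dict[State, bool]:
    """Evaluate the horizon check from several start states."""
    return {s: decreases_at_horizon(policy, step, s, goal, horizon) for s in starts}


# Toy point-mass dynamics and two hypothetical agents (goal assumed at the origin).
def step(state: State, action: Action) -> State:
    return (state[0] + action[0], state[1] + action[1])


def good_policy(state: State) -> Action:
    return (-0.2 * state[0], -0.2 * state[1])  # moves toward the origin


def bad_policy(state: State) -> Action:
    return (0.2 * state[0], 0.2 * state[1])  # moves away from the origin


if __name__ == "__main__":
    starts = [(1.0, 2.0), (-3.0, 0.5)]
    print(stability_map(good_policy, step, starts, (0.0, 0.0), horizon=5))
    print(stability_map(bad_policy, step, starts, (0.0, 0.0), horizon=5))
```

In this sketch, start states for which the check returns `False` would be the candidates for regions of the navigation area where training was insufficient, in the spirit of the abstract's final claim.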


Pages: 27

Total: 50 items

[31] Friend-or-Foe Deep Deterministic Policy Gradient
Jiang, Hao
Shi, Dianxi
Xue, Chao
Wang, Yajie
Wang, Gongju
Zhang, Yongjun
2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 3523 - 3530
[32] Deep Deterministic Policy Gradient for Nested Parallel Negotiation
Arakawa, Ryota
Fujita, Katsuhide
2023 IEEE INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY, WI-IAT, 2023, : 197 - 204
[33] Deep Deterministic Policy Gradient With Classified Experience Replay
Shi S.-M.
Liu Q.
Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (07): : 1816 - 1823
[34] Deep Deterministic Policy Gradient with Clustered Prioritized Sampling
Wu, Wen
Zhu, Fei
Fu, YuChen
Liu, Quan
NEURAL INFORMATION PROCESSING (ICONIP 2018), PT II, 2018, 11302 : 645 - 654
[35] Deep Deterministic Policy Gradient With Compatible Critic Network
Wang, Di
Hu, Mengqi
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (08) : 4332 - 4344
[36] Deep deterministic policy gradient algorithm: A systematic review
Sumiea, Ebrahim Hamid
Abdulkadir, Said Jadid
Alhussian, Hitham Seddig
Al-Selwi, Safwan Mahmood
Alqushaibi, Alawi
Ragab, Mohammed Gamal
Fati, Suliman Mohamed
HELIYON, 2024, 10 (09)
[37] Deep deterministic policy gradient algorithm for UAV control
Huang X.
Liu J.
Jia C.
Wang Z.
Zhang J.
Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica, 2021, 42 (11):
[38] Developing Flight Control Policy Using Deep Deterministic Policy Gradient
Tsourdos, Antonios
Permana, Adhi Dharma
Budiarti, Dewi H.
Shin, Hyo-Sang
Lee, Chang-Hun
2019 IEEE INTERNATIONAL CONFERENCE ON AEROSPACE ELECTRONICS AND REMOTE SENSING TECHNOLOGY (ICARES 2019), 2019,
[39] Unmanned Aerial Vehicle Trajectory Planning and Power Control Algorithm Based on Deep Deterministic Policy Gradient
Yang Q.
Chen J.
Peng Y.
Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (03): : 43 - 48
[40] Visibility Analysis for Autonomous Vehicle Comfortable Navigation
Morales, Yoichi
Even, Jani
Kallakuri, Nagasrikanth
Ikeda, Tetsushi
Shinozawa, Kazuhiko
Kondo, Tadahisa
Hagita, Norihiro
2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2014, : 2197 - 2202
