Stability Analysis for Autonomous Vehicle Navigation Trained over Deep Deterministic Policy Gradient

Cited by: 1
Authors
Cabezas-Olivenza, Mireya [1 ]
Zulueta, Ekaitz [1 ]
Sanchez-Chica, Ander [1 ]
Fernandez-Gamiz, Unai [2 ]
Teso-Fz-Betono, Adrian [1 ]
Affiliations
[1] Univ Basque Country UPV EHU, Syst Engn & Automat Control Dept, Nieves Cano 12, Vitoria 01006, Spain
[2] Univ Basque Country UPV EHU, Dept Nucl & Fluid Mech, Nieves Cano 12, Vitoria 01006, Spain
Keywords
navigation; neural network; autonomous vehicle; reinforcement learning; DDPG; Lyapunov; stability; Q-learning; dynamic window approach; robot
DOI
10.3390/math11010132
Chinese Library Classification
O1 [Mathematics]
Subject Classification Codes
0701; 070101
Abstract
The Deep Deterministic Policy Gradient (DDPG) algorithm is a reinforcement learning algorithm that combines Q-learning with a policy. Nevertheless, this algorithm produces failures that are not well understood. Rather than searching for those errors, this study presents a way to evaluate the suitability of the results obtained. With autonomous vehicle navigation as the target application, the DDPG algorithm is applied to obtain an agent capable of generating trajectories. This agent is evaluated in terms of stability through the Lyapunov function, verifying whether the proposed navigation objectives are achieved. The reward function of the DDPG is used for this evaluation because it is not known whether the actor and critic neural networks are correctly trained. Two agents are obtained and compared in terms of stability, demonstrating that the Lyapunov function can be used as an evaluation method for agents obtained by the DDPG algorithm. By verifying stability over a fixed future horizon, it is possible to determine whether the obtained agent is valid and can be used as a vehicle controller, so a task-satisfaction assessment can be performed. Furthermore, the proposed analysis indicates which parts of the navigation area are insufficiently covered by the training.
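To make the evaluation idea in the abstract concrete, the following is a minimal, illustrative sketch, not the paper's implementation: it rolls a trained policy forward over a fixed future horizon and tests a Lyapunov-style condition, assuming the candidate function V(x) = ||x - goal||^2 (squared distance to the navigation goal). All names (lyapunov_stable, policy, step, goal, horizon) are hypothetical.

```python
import numpy as np

def lyapunov_stable(policy, step, x0, goal, horizon=50, tol=1e-9):
    """Illustrative stability check for a trained navigation agent.

    Rolls the policy forward from state x0 over a fixed future horizon
    and tests the candidate Lyapunov function V(x) = ||x - goal||^2:
    the rollout passes only if V stays positive away from the goal and
    strictly decreases at every step.

    policy -- maps a state to an action (e.g., a trained DDPG actor)
    step   -- vehicle/environment model: (state, action) -> next state
    """
    V = lambda x: float(np.sum((np.asarray(x) - goal) ** 2))
    x, v_prev = x0, V(x0)
    for _ in range(horizon):
        x = step(x, policy(x))
        v = V(x)
        if v <= tol:           # goal reached within the horizon
            return True
        if v >= v_prev - tol:  # V failed to decrease: unstable region
            return False
        v_prev = v
    return True                # V decreased monotonically over the horizon
```

Sweeping such a check over a grid of start states would flag the parts of the navigation area that the abstract describes as insufficiently trained.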
Pages: 27
Related Papers
50 items in total
  • [41] Adaptive Navigation Algorithm with Deep Learning for Autonomous Underwater Vehicle
    Ma, Hui
    Mu, Xiaokai
    He, Bo
    SENSORS, 2021, 21 (19)
  • [42] State Representation Learning for Minimax Deep Deterministic Policy Gradient
    Hu, Dapeng
    Jiang, Xuesong
    Wei, Xiumei
    Wang, Jian
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT I, 2019, 11775 : 481 - 487
  • [43] Deep Deterministic Policy Gradient With Prioritized Sampling for Power Control
    Zhou, Shiyang
    Cheng, Yufan
    Lei, Xia
    Duan, Huanhuan
    IEEE ACCESS, 2020, 8 : 194240 - 194250
  • [44] A Method of Attitude Control Based on Deep Deterministic Policy Gradient
    Zhang, Jian
    Wu, Fengge
    Zhao, Junsuo
    Xu, Fanjiang
    COGNITIVE SYSTEMS AND SIGNAL PROCESSING, PT II, 2019, 1006 : 197 - 207
  • [45] Network Architecture Reasoning via Deep Deterministic Policy Gradient
    Liu, Huidong
    Du, Fang
    Tang, Xiaofen
    Liu, Hao
    Yu, Zhenhua
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020
  • [46] Controlling Bicycle Using Deep Deterministic Policy Gradient Algorithm
    Le Pham Tuyen
    Chung, TaeChoong
    2017 14TH INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS AND AMBIENT INTELLIGENCE (URAI), 2017, : 413 - 417
  • [47] Semicentralized Deep Deterministic Policy Gradient in Cooperative StarCraft Games
    Xie, Dong
    Zhong, Xiangnan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) : 1584 - 1593
  • [48] Network Architecture for Optimizing Deep Deterministic Policy Gradient Algorithms
    Zhang, Haifei
    Xu, Jian
    Zhang, Jian
    Liu, Quan
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [49] Dynamical Motor Control Learned with Deep Deterministic Policy Gradient
    Shi, Haibo
    Sun, Yaoru
    Li, Jie
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2018, 2018
  • [50] Byzantine-Robust Federated Deep Deterministic Policy Gradient
    Lin, Qifeng
    Ling, Qing
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4013 - 4017