Policy Iteration Based Approximate Dynamic Programming Toward Autonomous Driving in Constrained Dynamic Environment

Cited by: 16
Authors
Lin, Ziyu [1]
Ma, Jun [2, 3]
Duan, Jingliang [4]
Li, Shengbo Eben [1]
Ma, Haitong [1]
Cheng, Bo [1]
Lee, Tong Heng [5]
Affiliations
[1] Tsinghua Univ, Sch Vehicle & Mobil, Beijing 100084, Peoples R China
[2] Hong Kong Univ Sci & Technol Guangzhou, Robot & Autonomous Syst Thrust, Guangzhou, Peoples R China
[3] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[4] Univ Sci & Technol Beijing, Sch Mech Engn, Beijing 100083, Peoples R China
[5] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117583, Singapore
Funding
National Key Research and Development Program of China;
Keywords
Planning; Autonomous vehicles; Vehicle dynamics; Task analysis; Heuristic algorithms; Approximation algorithms; Roads; Autonomous driving; approximate dynamic programming; motion planning; constrained optimization; reinforcement learning; VEHICLE;
DOI
10.1109/TITS.2023.3237568
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813;
Abstract
In autonomous driving, the motion planning problem is difficult to solve because the vehicle model is nonlinear and the driving scenarios are complex. In particular, most existing methods do not generalize to dynamically changing scenarios with varying surrounding vehicles. To address this problem, this work investigates the framework of integrated decision and control. Within this framework, a static path planning module determines the candidate reference paths ahead, and an optimal path-tracking controller then carries out the specific autonomous driving task. A constrained finite-horizon approximate dynamic programming (ADP) algorithm is presented to generate the control policy for path tracking. Using a generalized policy neural network that maps the state to the control input, the proposed algorithm remains effective for motion planning in changing driving environments with varying surrounding vehicles. Moreover, the algorithm alleviates the typically heavy computational load by training offline and executing online. Because it uses multi-layer neural networks within the actor-critic framework, the constrained ADP method can handle complex, multidimensional scenarios. Finally, simulations show that the constrained ADP algorithm is effective.
Pages: 5003-5013
Page count: 11
Related papers
50 in total
  • [1] Empirical Policy Iteration for Approximate Dynamic Programming
    Haskell, William B.
    Jain, Rahul
    Kalathil, Dileep
    2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 6573 - 6578
  • [2] Policy Iteration Approximate Dynamic Programming Using Volterra Series Based Actor
    Guo, Wentao
    Si, Jennie
    Liu, Feng
    Mei, Shengwei
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 249 - 255
  • [3] Approximate policy iteration for dynamic resource-constrained project scheduling
    Parizi, Mahshid Salemi
    Gocgun, Yasin
    Ghate, Archis
    OPERATIONS RESEARCH LETTERS, 2017, 45 (05) : 442 - 447
  • [4] Empirical Value Iteration for Approximate Dynamic Programming
    Haskell, William B.
    Jain, Rahul
    Kalathil, Dileep
    2014 AMERICAN CONTROL CONFERENCE (ACC), 2014, : 495 - 500
  • [5] Policy Approximation in Policy Iteration Approximate Dynamic Programming for Discrete-Time Nonlinear Systems
    Guo, Wentao
    Si, Jennie
    Liu, Feng
    Mei, Shengwei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 2794 - 2807
  • [6] REDUCED COMPLEXITY DYNAMIC-PROGRAMMING BASED ON POLICY ITERATION
    BAYARD, DS
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1992, 170 (01) : 75 - 103
  • [7] Policy iteration-approximate dynamic programming for large scale unit commitment problems
    Wei, Hua
    Long, Danli
    Li, Jinghua
    Zhongguo Dianji Gongcheng Xuebao/Proceedings of the Chinese Society of Electrical Engineering, 2014, 34 (25): 4420 - 4429
  • [8] Research on Autonomous Maneuvering Decision of UCAV Based on Approximate Dynamic Programming
    Hu, Zhencai
    Gao, Peng
    Wang, Fei
    2019 INTERNATIONAL CONFERENCE ON IMAGE AND VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2019, 11321
  • [9] An approximate dynamic programming approach for communication constrained inference
    Williams, J. L.
    Fisher, J. W., III
    Willsky, A. S.
    2005 IEEE/SP 13TH WORKSHOP ON STATISTICAL SIGNAL PROCESSING (SSP), VOLS 1 AND 2, 2005, : 1129 - 1134
  • [10] Adaptive feedback control by constrained approximate dynamic programming
    Ferrari, Silvia
    Steck, James E.
    Chandramohan, Rajeev
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (04): 982 - 987