Research on Vehicle Control Algorithm Based on Distributed Reinforcement Learning

Cited by: 0

Authors
Liu W. [1 ,2 ]
Xiang Z. [1 ]
Qi D. [2 ]
Wang Z. [2 ]
Affiliations
[1] School of Information and Electronic Engineering, Zhejiang University, Hangzhou
[2] National Innovation Center of Intelligent and Connected Vehicles, Beijing
Keywords
autonomous driving; Carla; distributed system; multi-agent; reinforcement learning; vehicle control;
DOI: 10.19562/j.chinasae.qcgc.2023.09.012
Abstract
The development of end-to-end autonomous driving algorithms has become a hot topic in autonomous driving research and development. Classic reinforcement learning (RL) algorithms use information such as vehicle state and environmental feedback to train the vehicle, learning the best driving strategy through trial and error and thereby enabling end-to-end autonomous driving. However, development efficiency remains low. This article proposes an asynchronous distributed reinforcement learning framework to address the inefficiency and high complexity of training RL algorithms in virtual simulation environments, establishing an intra- and inter-process multi-agent parallel Soft Actor-Critic (SAC) distributed training framework on the Carla simulator to accelerate online RL training. Additionally, to achieve rapid model training and deployment, the article proposes a distributed model training and deployment system architecture based on Cloud-OTA, which mainly consists of an Over-the-Air Technology (OTA) platform, a cloud-based distributed training platform, and an on-vehicle computing platform. On this basis, the paper establishes an Autoware-Carla integrated validation framework based on ROS to improve model reusability and reduce migration and deployment costs. The experimental results show that, compared with various mainstream autonomous driving methods, the proposed method trains faster, copes effectively with dense traffic flow, improves the adaptability of end-to-end autonomous driving strategies to unknown scenes, and reduces the time and resources required for experiments in real environments. © 2023 SAE-China. All rights reserved.
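The paper's own implementation is not reproduced here. As a loose illustration of the parallel data collection that the abstract describes (multiple agents feeding one shared experience store while a learner samples from it), the following Python sketch runs several rollout workers concurrently against a shared, lock-protected replay buffer. Every name (`ReplayBuffer`, `rollout_worker`, `collect_parallel`), the stub environment, and the stub reward are hypothetical assumptions for illustration only; a real setup in the paper's framework would step Carla clients and perform SAC network updates instead.

```python
import random
import threading
from collections import deque

class ReplayBuffer:
    """Thread-safe experience buffer shared by all rollout workers (illustrative)."""
    def __init__(self, capacity=10_000):
        self._buf = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._buf.append(transition)

    def sample(self, batch_size):
        with self._lock:
            return random.sample(list(self._buf), min(batch_size, len(self._buf)))

    def __len__(self):
        with self._lock:
            return len(self._buf)

def rollout_worker(buffer, n_steps):
    """One parallel actor: steps a stub 1-D environment and pushes transitions."""
    state = 0.0
    for _ in range(n_steps):
        action = random.uniform(-1.0, 1.0)  # placeholder for a learned SAC policy
        next_state = state + action
        reward = -abs(next_state)           # stub reward: stay near the origin
        buffer.add((state, action, reward, next_state))
        state = next_state

def collect_parallel(n_workers=4, steps_per_worker=250):
    """Launch asynchronous rollout workers that feed one shared buffer."""
    buffer = ReplayBuffer()
    workers = [threading.Thread(target=rollout_worker, args=(buffer, steps_per_worker))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return buffer

buffer = collect_parallel()
# A learner would now repeatedly sample minibatches and update the SAC networks.
batch = buffer.sample(64)
```

In the asynchronous actor-learner pattern, the learner samples while workers are still collecting; here the workers are joined first only to keep the sketch deterministic.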
Pages: 1637-1645 (8 pages)