Autoregressive Policies for Continuous Control Deep Reinforcement Learning

Authors
Korenkevych, Dmytro [1]
Mahmood, A. Rupam [1]
Vasan, Gautham [1]
Bergstra, James [1]
Affiliations
[1] Kindred AI, San Francisco, CA 94107 USA
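Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract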
Reinforcement learning algorithms rely on exploration to discover new behaviors, which is typically achieved by following a stochastic policy. In continuous control tasks, policies with a Gaussian distribution have been widely adopted. Gaussian exploration, however, does not produce the smooth trajectories that generally correspond to safe and rewarding behaviors in practical tasks. In addition, Gaussian policies do not explore an environment effectively and become increasingly inefficient as the action rate increases. This contributes to the low sample efficiency often observed in learning continuous control tasks. We introduce a family of stationary autoregressive (AR) stochastic processes to facilitate exploration in continuous control domains. We show that the proposed processes possess two desirable features: subsequent process observations are temporally coherent with a continuously adjustable degree of coherence, and the process stationary distribution is standard normal. We derive an autoregressive policy (ARP) that implements such processes while maintaining the standard agent-environment interface. We show how ARPs can be easily used with existing off-the-shelf learning algorithms. Empirically, we demonstrate that using ARPs results in improved exploration and sample efficiency in both simulated and real-world domains and, furthermore, provides smooth exploration trajectories that enable safe operation of robotic hardware.
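The two properties named in the abstract (temporal coherence with an adjustable degree, and a standard normal stationary distribution) can be illustrated with a first-order AR process. The sketch below is only a minimal illustration under stated assumptions, not the construction derived in the paper: the function name `ar1_noise`, the parameter `alpha`, and the recursion x_t = alpha * x_{t-1} + sqrt(1 - alpha^2) * eps_t are all chosen here for illustration.

```python
import math
import random


def ar1_noise(alpha, steps, seed=0):
    """Sample a stationary AR(1) sequence with N(0, 1) marginals.

    alpha is a hypothetical coherence knob in [0, 1): 0 gives white
    Gaussian noise, values near 1 give slowly varying noise.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    rng = random.Random(seed)
    # Scaling the innovation by sqrt(1 - alpha^2) keeps the variance at 1:
    # Var[x_t] = alpha^2 * 1 + (1 - alpha^2) * 1 = 1.
    scale = math.sqrt(1.0 - alpha * alpha)
    x = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    samples = [x]
    for _ in range(steps - 1):
        x = alpha * x + scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


if __name__ == "__main__":
    smooth = ar1_noise(alpha=0.9, steps=10)
    white = ar1_noise(alpha=0.0, steps=10)
    print("alpha=0.9:", [round(v, 2) for v in smooth])
    print("alpha=0.0:", [round(v, 2) for v in white])
```

In this sketch, raising `alpha` toward 1 makes consecutive samples more similar while leaving the marginal distribution of each sample unchanged. How the paper's ARP turns such a noise process into actions through the standard agent-environment interface is not covered here.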
Pages: 2754-2762
Number of pages: 9