Delay-aware model-based reinforcement learning for continuous control

Cited by: 26
Authors
Chen, Baiming [1]
Xu, Mengdi [2]
Li, Liang [1]
Zhao, Ding [2]
Affiliations
[1] Tsinghua Univ, Beijing 100084, Peoples R China
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
Model-based reinforcement learning; Markov decision process; Continuous control; Delayed system; FINITE SPECTRUM ASSIGNMENT; DEEP NEURAL-NETWORKS; SMITH PREDICTOR; SYSTEMS; INTEGRATOR; STABILITY; ROBOT;
DOI
10.1016/j.neucom.2021.04.015
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of the delay-aware Markov Decision Process (MDP) and proves that it can be transformed into a standard MDP with augmented states using the Markov reward process. We develop a delay-aware model-based reinforcement learning framework that can incorporate the multi-step delay into the learned system models without learning effort. Experiments with the Gym and MuJoCo platforms show that the proposed delay-aware model-based algorithm is more efficient in training and more transferable between systems with various delay durations than state-of-the-art model-free reinforcement learning methods. © 2021 Elsevier B.V. All rights reserved.
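The state augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `ToyIntegrator`, `DelayedEnv`, and `default_action` are hypothetical names, and the toy dynamics are an assumption chosen only to make the delay visible. The idea shown is that an action submitted at step t is executed `delay` steps later, and that appending the queue of pending actions to the observation gives a state from which the next transition can again be predicted.

```python
from collections import deque


class ToyIntegrator:
    """Hypothetical 1-D system (an assumption): x_{t+1} = x_t + a_t."""

    def __init__(self):
        self.x = 0.0

    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, action):
        self.x += action
        reward = -abs(self.x)  # illustrative reward: stay near the origin
        return self.x, reward


class DelayedEnv:
    """Wraps an environment so each action takes effect `delay` steps late.

    The returned state is the pair (observation, pending actions), i.e. the
    augmented state of the delay-free reformulation.
    """

    def __init__(self, env, delay, default_action=0.0):
        self.env = env
        self.delay = delay
        self.default_action = default_action  # assumed filler before any action arrives
        self.pending = deque()

    def reset(self):
        obs = self.env.reset()
        self.pending = deque([self.default_action] * self.delay)
        return (obs, tuple(self.pending))

    def step(self, action):
        # Queue the new action, then execute the oldest pending one.
        self.pending.append(action)
        applied = self.pending.popleft()
        obs, reward = self.env.step(applied)
        return (obs, tuple(self.pending)), reward


env = DelayedEnv(ToyIntegrator(), delay=2)
state = env.reset()
trajectory = [state]
for a in [1.0, 2.0, 3.0]:
    state, reward = env.step(a)
    trajectory.append(state)
```

In this sketch the first submitted action (1.0) only reaches the system on the third step, while the augmented state always exposes the actions that are still in flight. A learned dynamics model for the undelayed system could, under this assumption, be reused across different values of `delay` by changing only the queue length.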
Pages: 119-128
Number of pages: 10