A delay-robust method for enhanced real-time reinforcement learning

Cited by: 0
Authors
[1] Xia, Bo
[2] Sun, Haoyuan
[3] Yuan, Bo
[4] Li, Zhiheng
[5] Liang, Bin
[6] Wang, Xueqian
Keywords
Markov processes
DOI
10.1016/j.neunet.2024.106769
Abstract
In reinforcement learning, the Markov Decision Process (MDP) framework typically operates under a blocking paradigm, assuming a static environment during the agent's decision-making and stationary agent behavior while the environment executes its actions. This static model often proves inadequate for real-time tasks, as it lacks the flexibility to handle concurrent changes in both the agent's decision-making process and the environment's dynamic responses. Contemporary solutions, such as linear interpolation or state space augmentation, attempt to address the asynchronous nature of delayed states and actions in real-time environments. However, these methods frequently require precise delay measurements and may fail to fully capture the complexities of delay dynamics. To address these challenges, we introduce a minimal information set that encapsulates concurrent information during agent-environment interactions, serving as the foundation of our real-time decision-making framework. The traditional blocking-mode MDP is then reformulated as a Minimal Information State Markov Decision Process (MISMDP), aligning more closely with the demands of real-time environments. Within this MISMDP framework, we propose the Minimal information set for Real-time tasks using Actor-Critic (MRAC), a general approach for addressing delay issues in real-time tasks, supported by a rigorous theoretical analysis of Q-function convergence. Extensive experiments across both discrete and continuous action space environments demonstrate that MRAC outperforms state-of-the-art algorithms, delivering superior performance and generalization in managing delays within real-time tasks. © 2024
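To make the delay problem concrete, the sketch below illustrates the state-space-augmentation baseline the abstract contrasts with: when observations arrive with delay, the agent decides on the last observed state concatenated with the actions issued since. This is a minimal, hypothetical illustration, not the paper's MRAC method; the class name `MinimalInfoState` and the fixed-delay assumption `delay` are this sketch's own, and MRAC itself is designed to avoid requiring such a precise delay measurement.

```python
from collections import deque
import numpy as np

class MinimalInfoState:
    """Illustrative sketch (not the paper's implementation): build an
    augmented decision state from the most recent delayed observation
    plus the actions executed since that observation was generated,
    assuming a known, fixed delay of `delay` steps."""

    def __init__(self, delay, action_dim):
        self.delay = delay
        # Buffer of the last `delay` actions, oldest first.
        self.actions = deque(
            [np.zeros(action_dim) for _ in range(delay)], maxlen=delay
        )

    def reset(self, obs):
        # Zero the pending-action buffer at episode start.
        for a in self.actions:
            a[:] = 0.0
        return self._augment(obs)

    def step(self, delayed_obs, last_action):
        # Record the action just taken; the oldest buffered action
        # drops out automatically (deque with maxlen).
        self.actions.append(np.asarray(last_action, dtype=float))
        return self._augment(delayed_obs)

    def _augment(self, obs):
        # Concatenate delayed observation with the pending-action buffer.
        return np.concatenate(
            [np.asarray(obs, dtype=float)] + [a.ravel() for a in self.actions]
        )
```

The augmented state has dimension `obs_dim + delay * action_dim`, so the actor and critic condition on everything the agent can actually know at decision time; the cost, which the abstract's MISMDP formulation aims to sidestep, is that `delay` must be known exactly.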
Related Papers
50 records in total
  • [1] Evolving population method for real-time reinforcement learning
    Kim, Man-Je
    Kim, Jun Suk
    Ahn, Chang Wook
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 229
  • [2] Real-Time Reinforcement Learning
    Ramstedt, Simon
    Pal, Christopher
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [3] Applying Reinforcement Learning Method for Real-time Energy Management
    Dayani, Aida Borhan
    Fazlollahtabar, Hamed
    Ahmadiahangar, Roya
    Rosin, Argo
    Naderi, Mohammad Salay
    Bagheri, Mehdi
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ENVIRONMENT AND ELECTRICAL ENGINEERING AND 2019 IEEE INDUSTRIAL AND COMMERCIAL POWER SYSTEMS EUROPE (EEEIC / I&CPS EUROPE), 2019,
  • [4] Control Delay in Reinforcement Learning for Real-Time Dynamic Systems: A Memoryless Approach
    Schuitema, Erik
    Busoniu, Lucian
    Babuska, Robert
    Jonker, Pieter
    [J]. IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010), 2010, : 3226 - 3231
  • [5] Benchmarking Real-Time Reinforcement Learning
    Thodoroff, Pierre
    Li, Wenyu
    Lawrence, Neil D.
    [J]. NEURIPS 2021 WORKSHOP ON PRE-REGISTRATION IN MACHINE LEARNING, VOL 181, 2021, 181 : 26 - 41
  • [6] English synchronous real-time translation method based on reinforcement learning
    Ke, Xin
    [J]. WIRELESS NETWORKS, 2024, 30 (05) : 4167 - 4179
  • [7] Reinforcement Learning Method for Ad Networks Ordering in Real-Time Bidding
    Afshar, Reza Refaei
    Zhang, Yingqian
    Firat, Murat
    Kaymak, Uzay
    [J]. AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2019, 2019, 11978 : 16 - 36
  • [8] Real-Time Holding Control for Transfer Synchronization via Robust Multiagent Reinforcement Learning
    Yu, Xinlian
    Khani, Alireza
    Chen, Jingxu
    Xu, Hongli
    Mao, Haijun
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (12) : 23993 - 24007
  • [9] Real-Time IDS Using Reinforcement Learning
    Sagha, Hesam
    Shouraki, Saeed Bagheri
    Khasteh, Hosein
    Dehghani, Mahdi
    [J]. 2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL II, PROCEEDINGS, 2008, : 593 - +
  • [10] Reinforcement learning in real-time geometry assurance
    Jorge, Emilio
    Brynte, Lucas
    Cronrath, Constantin
    Wigstrom, Oskar
    Bengtsson, Kristofer
    Gustavsson, Emil
    Lennartson, Bengt
    Jirstrand, Mats
    [J]. 51ST CIRP CONFERENCE ON MANUFACTURING SYSTEMS, 2018, 72 : 1073 - 1078