A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems

Cited by: 0
Authors
Lei Zheng
Siu-Yeung Cho
Affiliations
[1] Nanyang Technological University,School of Computer Engineering
Source
Neural Processing Letters | 2011 / Vol. 33
Keywords
Memory-based reinforcement learning; Markov decision processes; Partially observable Markov decision processes; Reinforcement learning;
DOI
Not available
Abstract
Partially observable Markov decision processes (POMDPs) provide a mathematical framework for agent planning in stochastic and partially observable environments. The classic Bayesian optimal solution can be obtained by transforming the problem into a Markov decision process (MDP) over belief states. However, because the belief state space is continuous and multi-dimensional, the problem is highly intractable. Many practical heuristic-based methods have been proposed, but most of them require a complete POMDP model of the environment, which is not always available in practice. This article introduces a modified memory-based reinforcement learning algorithm, called modified U-Tree, that is capable of learning from raw sensor experiences with minimal prior knowledge. The article describes an enhancement of the original U-Tree's state generation process that makes the generated model more compact, and also proposes a modification of the statistical test for reward estimation, which allows the algorithm to be benchmarked against traditional model-based algorithms on a set of well-known POMDP problems.
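The belief-state transformation mentioned in the abstract can be illustrated with a short sketch. This is not the paper's modified U-Tree algorithm; it is the standard Bayes-filter belief update that converts a POMDP into a continuous-state MDP, shown on a made-up two-state, one-action, two-observation model (all probabilities below are invented for illustration):

```python
import numpy as np

# Hypothetical toy model: 2 states, 1 action, 2 observations.
T = np.array([[[0.7, 0.3],   # T[a][s][s'] = P(s' | s, a)
               [0.2, 0.8]]])
O = np.array([[[0.9, 0.1],   # O[a][s'][o] = P(o | s', a)
               [0.3, 0.7]]])

def belief_update(b, a, o):
    """Bayes-filter update: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) b(s)."""
    predicted = b @ T[a]              # prediction step: sum_s b(s) T(s'|s,a)
    unnorm = predicted * O[a][:, o]   # correction step: weight by P(o | s', a)
    return unnorm / unnorm.sum()      # normalize to a probability distribution

b0 = np.array([0.5, 0.5])            # uniform initial belief
b1 = belief_update(b0, a=0, o=0)
print(b1)                            # posterior belief after action 0, observation 0
```

Because beliefs like `b1` live in a continuous simplex, exact planning over them is intractable, which is the motivation for model-free, memory-based approaches such as U-Tree that instead build discrete state distinctions directly from observation histories.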
Pages: 187 - 200
Page count: 13
Related Papers
50 items
  • [1] A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems
    Zheng, Lei
    Cho, Siu-Yeung
    NEURAL PROCESSING LETTERS, 2011, 33 (02) : 187 - 200
  • [2] Sequence Q-Learning: a Memory-based Method Towards Solving POMDP
    Zuters, Janis
    2015 20TH INTERNATIONAL CONFERENCE ON METHODS AND MODELS IN AUTOMATION AND ROBOTICS (MMAR), 2015, : 495 - 500
  • [3] Memory-Based Explainable Reinforcement Learning
    Cruz, Francisco
    Dazeley, Richard
    Vamplew, Peter
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 66 - 77
  • [4] Reinforcement Learning Using a Stochastic Gradient Method with Memory-Based Learning
    Yamada, Takafumi
    Yamaguchi, Satoshi
    ELECTRICAL ENGINEERING IN JAPAN, 2010, 173 (01) : 32 - 40
  • [5] Hierarchical memory-based reinforcement learning
    Hernandez-Gardiol, N
    Mahadevan, S
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13, 2001, 13 : 1047 - 1053
  • [6] Memory-based Deep Reinforcement Learning for POMDPs
    Meng, Lingheng
    Gorbet, Rob
    Kulic, Dana
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 5619 - 5626
  • [7] Study on LSTM and ConvLSTM Memory-Based Deep Reinforcement Learning
    Duarte, Fernando Fradique
    Lau, Nuno
    Pereira, Artur
    Reis, Luis Paulo
    AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2023, 2024, 14546 : 223 - 243
  • [8] Deep reinforcement learning method for POMDP based tram signal priority
    Tang, Qianxue
    Zhang, Lin
    Li, Dong
    Ouyang, Zibo
    Zheng, Wei
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 229 - 234
  • [9] Memory-based reinforcement learning algorithm for autonomous exploration in unknown environment
    Dooraki, Amir Ramezani
    Lee, Deok Jin
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2018, 15 (03):
  • [10] A memory-based reinforcement learning model utilizing macro-actions
    Murata, M
    Ozawa, S
    ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, 2005, : 78 - 81