A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems

Cited by: 0
Authors
Lei Zheng
Siu-Yeung Cho
Affiliations
[1] Nanyang Technological University,School of Computer Engineering
Source
Neural Processing Letters | 2011 / Vol. 33
Keywords
Memory-based reinforcement learning; Markov decision processes; Partially observable Markov decision processes; Reinforcement learning;
DOI
Not available
Abstract
Partially observable Markov decision processes (POMDPs) provide a mathematical framework for agent planning in stochastic and partially observable environments. The classic Bayesian optimal solution can be obtained by transforming the problem into a Markov decision process (MDP) over belief states. However, because the belief state space is continuous and multi-dimensional, the problem is highly intractable. Many practical heuristic-based methods have been proposed, but most of them require a complete POMDP model of the environment, which is not always available in practice. This article introduces a modified memory-based reinforcement learning algorithm, called modified U-Tree, that is capable of learning from raw sensor experiences with minimal prior knowledge. The article describes an enhancement of the original U-Tree's state generation process that makes the generated model more compact, and also proposes a modification of the statistical test for reward estimation, which allows the algorithm to be benchmarked against traditional model-based algorithms on a set of well-known POMDP problems.
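The belief-state transformation mentioned in the abstract can be illustrated with a short sketch. This is not the paper's modified U-Tree algorithm; it is the standard Bayes-filter belief update that converts a POMDP into a continuous-state MDP, shown on a made-up two-state, one-action, two-observation model (all probabilities below are invented for illustration):

```python
import numpy as np

# Hypothetical toy model: 2 states, 1 action, 2 observations.
T = np.array([[[0.7, 0.3],   # T[a][s][s'] = P(s' | s, a)
               [0.2, 0.8]]])
O = np.array([[[0.9, 0.1],   # O[a][s'][o] = P(o | s', a)
               [0.3, 0.7]]])

def belief_update(b, a, o):
    """Bayes-filter update: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) b(s)."""
    predicted = b @ T[a]              # prediction step: sum_s b(s) T(s'|s,a)
    unnorm = predicted * O[a][:, o]   # correction step: weight by P(o | s', a)
    return unnorm / unnorm.sum()      # normalize to a probability distribution

b0 = np.array([0.5, 0.5])            # uniform initial belief
b1 = belief_update(b0, a=0, o=0)
print(b1)                            # posterior belief after action 0, observation 0
```

Because beliefs like `b1` live in a continuous simplex, exact planning over them is intractable, which is the motivation for model-free, memory-based approaches such as U-Tree that instead build discrete state distinctions directly from observation histories.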
Pages: 187 - 200
Page count: 13
Related Papers
50 items
  • [1] A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems
    Zheng, Lei
    Cho, Siu-Yeung
    NEURAL PROCESSING LETTERS, 2011, 33 (02) : 187 - 200
  • [2] Sequence Q-Learning: a Memory-based Method Towards Solving POMDP
    Zuters, Janis
    2015 20TH INTERNATIONAL CONFERENCE ON METHODS AND MODELS IN AUTOMATION AND ROBOTICS (MMAR), 2015, : 495 - 500
  • [3] Memory-Based Explainable Reinforcement Learning
    Cruz, Francisco
    Dazeley, Richard
    Vamplew, Peter
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 66 - 77
  • [4] Reinforcement Learning Using a Stochastic Gradient Method with Memory-Based Learning
    Yamada, Takafumi
    Yamaguchi, Satoshi
    ELECTRICAL ENGINEERING IN JAPAN, 2010, 173 (01) : 32 - 40
  • [5] Hierarchical memory-based reinforcement learning
    Hernandez-Gardiol, N
    Mahadevan, S
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13, 2001, 13 : 1047 - 1053
  • [6] Memory-based Deep Reinforcement Learning for POMDPs
    Meng, Lingheng
    Gorbet, Rob
    Kulic, Dana
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 5619 - 5626
  • [7] Study on LSTM and ConvLSTM Memory-Based Deep Reinforcement Learning
    Duarte, Fernando Fradique
    Lau, Nuno
    Pereira, Artur
    Reis, Luis Paulo
    AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2023, 2024, 14546 : 223 - 243
  • [8] Deep reinforcement learning method for POMDP based tram signal priority
    Tang, Qianxue
    Zhang, Lin
    Li, Dong
    Ouyang, Zibo
    Zheng, Wei
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 229 - 234
  • [9] Memory-based reinforcement learning algorithm for autonomous exploration in unknown environment
    Dooraki, Amir Ramezani
    Lee, Deok Jin
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2018, 15 (03):
  • [10] A memory-based reinforcement learning model utilizing macro-actions
    Murata, M
    Ozawa, S
    ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, 2005, : 78 - 81