A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems

Cited by: 0
Authors
Lei Zheng
Siu-Yeung Cho
Affiliations
[1] Nanyang Technological University,School of Computer Engineering
Source
Neural Processing Letters | 2011, Vol. 33
Keywords
Memory-based reinforcement learning; Markov decision processes; Partially observable Markov decision processes; Reinforcement learning;
DOI: not available
Abstract
Partially observable Markov decision processes (POMDPs) provide a mathematical framework for agent planning in stochastic, partially observable environments. The classic Bayesian optimal solution can be obtained by transforming the problem into a Markov decision process (MDP) over belief states. However, because the belief-state space is continuous and multi-dimensional, the problem is highly intractable. Many practical heuristic-based methods have been proposed, but most of them require a complete POMDP model of the environment, which is not always available. This article introduces a modified memory-based reinforcement learning algorithm, called modified U-Tree, that is capable of learning from raw sensor experiences with minimal prior knowledge. The article describes an enhancement of the original U-Tree's state-generation process that makes the generated model more compact, and also proposes a modification of the statistical test for reward estimation, which allows the algorithm to be benchmarked against traditional model-based algorithms on a set of well-known POMDP problems.
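The belief-state transformation mentioned in the abstract can be sketched as a Bayesian filter: the next belief is the observation-weighted, renormalized prediction of the transition model. The toy two-state, two-observation model below is hypothetical (not from the paper), and the action dependence of the transition and observation matrices is dropped for brevity:

```python
import numpy as np

# Hypothetical toy POMDP: 2 hidden states, 2 observations, one action.
T = np.array([[0.9, 0.1],    # T[s, s']: transition probabilities
              [0.2, 0.8]])
O = np.array([[0.7, 0.3],    # O[s', o]: observation probabilities in s'
              [0.4, 0.6]])

def belief_update(b, obs):
    """Bayes filter: b'(s') proportional to O(s', o) * sum_s T(s, s') * b(s)."""
    predicted = b @ T              # predicted next-state distribution
    unnorm = predicted * O[:, obs]  # weight by likelihood of the observation
    return unnorm / unnorm.sum()    # renormalize to a probability vector

b = np.array([0.5, 0.5])           # uniform initial belief
b = belief_update(b, obs=0)        # belief after observing o = 0
```

Iterating this update along an action-observation history yields the continuous belief-state MDP whose intractability motivates model-free alternatives such as U-Tree.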
Pages: 187-200 (13 pages)
Related Papers
50 records in total
  • [21] Memory-based crowd-aware robot navigation using deep reinforcement learning
    Samsani, Sunil Srivatsav
    Mutahira, Husna
    Muhammad, Mannan Saeed
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (02) : 2147 - 2158
  • [22] A tabular approach memory-based learning
    Lin, C.-S. (linc@missouri.edu), 1600, Taylor and Francis Inc. (05):
  • [23] Memory-based learning for visual odometry
    Roberts, Richard
    Nguyen, Hai
    Krishnamurthi, Niyant
    Balch, Tucker
    2008 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-9, 2008, : 47 - 52
  • [24] BAYESIAN REINFORCEMENT LEARNING FOR POMDP-BASED DIALOGUE SYSTEMS
    Png, ShaoWei
    Pineau, Joelle
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 2156 - 2159
  • [25] Multiagent Reinforcement Learning: Rollout and Policy Iteration for POMDP With Application to Multirobot Problems
    Bhattacharya, Sushmita
    Kailas, Siva
    Badyal, Sahil
    Gil, Stephanie
    Bertsekas, Dimitri
    IEEE TRANSACTIONS ON ROBOTICS, 2024, 40 : 2003 - 2023
  • [26] Estimate of current state based on experience in POMDP for Reinforcement Learning
    Miyazaki, Yoshiki
    Kurashige, Kentarou
    PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 17TH '12), 2012, : 1135 - 1138
  • [27] Memory-based problem solving and schema induction in Go
    Heneveld, A
    Bundy, A
    Ramscar, M
    Richardson, J
    PROCEEDINGS OF THE TWENTY-SECOND ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY, 2000, : 226 - 231
  • [28] Implicit memory-based technique in solving dynamic scheduling problems through Response Surface Methodology - Part I Model and method
    Abello, Manuel Blanco
    Michalewicz, Zbigniew
    INTERNATIONAL JOURNAL OF INTELLIGENT COMPUTING AND CYBERNETICS, 2014, 7 (02) : 111 - 139
  • [29] Reinforcement Learning for Solving Communication Problems in Shepherding
    Mohamed, Reem E.
    Elsayed, Saber
    Hunjet, Robert
    Abbass, Hussein
    2022 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2022, : 1626 - 1635
  • [30] Solving Safety Problems with Ensemble Reinforcement Learning
    Ferreira, Leonardo A.
    dos Santos, Thiago F.
    Bianchi, Reinaldo A. C.
    Santos, Paulo E.
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 203 - 214