Linear Reinforcement Learning with Ball Structure Action Space

Cited by: 0
|
Authors
Jia, Zeyu [1 ,2 ]
Jia, Randy [2 ]
Madeka, Dhruv [2 ]
Foster, Dean P. [2 ]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] Amazon, Seattle, WA USA
Keywords
Markov Decision Process; Reinforcement Learning;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
We study the problem of Reinforcement Learning (RL) with linear function approximation, i.e., assuming the optimal action-value function is linear in a known d-dimensional feature mapping. Unfortunately, under this assumption alone, the worst-case sample complexity has been shown to be exponential, even under a generative model. Instead of making further assumptions on the MDP or value functions, we assume that our action space is such that there always exist playable actions to explore any direction of the feature space. We formalize this assumption as a "ball structure" action space, and show that being able to freely explore the feature space allows for efficient RL. In particular, we propose a sample-efficient RL algorithm (BallRL) that learns an ε-optimal policy using only Õ(H^5 d^3 / ε^3) trajectories.
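The intuition behind the "ball structure" assumption can be illustrated with a toy sketch: if the playable actions cover the unit ball in feature space, any coordinate direction can be played directly, so a linear reward parameter is easy to estimate. This is not the paper's BallRL algorithm; the setup below (a noisy one-step linear reward, coordinate-direction probing) is a simplifying assumption made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
theta_true = rng.normal(size=d)  # unknown linear reward parameter

def reward(action):
    """Noisy linear reward; `action` is any vector in the unit ball."""
    return action @ theta_true + 0.01 * rng.normal()

# Ball structure: every unit vector e_i is itself a playable action,
# so theta can be estimated component-wise by direct probing.
n_samples = 200
theta_hat = np.zeros(d)
for i in range(d):
    e_i = np.zeros(d)
    e_i[i] = 1.0  # probe the i-th feature direction
    theta_hat[i] = np.mean([reward(e_i) for _ in range(n_samples)])

# Greedy action under the estimate: the ball's best response is the
# unit vector aligned with theta_hat.
best_action = theta_hat / np.linalg.norm(theta_hat)
print(np.linalg.norm(theta_hat - theta_true))  # small estimation error
```

In the full RL problem the same freedom to explore every feature direction is what lets BallRL avoid the exponential worst case, at the cost of the H and d factors in the trajectory bound above.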
Pages: 755-775
Page count: 21
Related Papers
50 records in total
  • [1] Switching reinforcement learning for continuous action space
    Nagayoshi, Masato
    Murao, Hajime
    Tamaki, Hisashi
    [J]. ELECTRONICS AND COMMUNICATIONS IN JAPAN, 2012, 95 (03) : 37 - 44
  • [2] Action Space Shaping in Deep Reinforcement Learning
    Kanervisto, Anssi
    Scheller, Christian
    Hautamaki, Ville
    [J]. 2020 IEEE CONFERENCE ON GAMES (IEEE COG 2020), 2020, : 479 - 486
  • [3] Reinforcement Learning in Latent Action Sequence Space
    Kim, Heecheol
    Yamada, Masanori
    Miyoshi, Kosuke
    Iwata, Tomoharu
    Yamakawa, Hiroshi
    [J]. 2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 5497 - 5503
  • [4] Couple Particles in Action Space for Reinforcement Learning
    Notsu, Akira
    Honda, Katsuhiro
    Ichihashi, Hidetomo
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2010, 10 (12): : 200 - 203
  • [5] LASER: Learning a Latent Action Space for Efficient Reinforcement Learning
    Allshire, Arthur
    Martin-Martin, Roberto
    Lin, Charles
    Manuel, Shawn
    Savarese, Silvio
    Garg, Animesh
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 6650 - 6656
  • [6] Hierarchical Advantage for Reinforcement Learning in Parameterized Action Space
    Hu, Zhejie
    Kaneko, Tomoyuki
    [J]. 2021 IEEE CONFERENCE ON GAMES (COG), 2021, : 816 - 823
  • [7] Deep Reinforcement Learning in Linear Discrete Action Spaces
    van Heeswijk, Wouter
    La Poutre, Han
    [J]. 2020 WINTER SIMULATION CONFERENCE (WSC), 2020, : 1063 - 1074
  • [8] A reinforcement learning with switching controllers for a continuous action space
    Nagayoshi, Masato
    Murao, Hajime
    Tamaki, Hisashi
    [J]. ARTIFICIAL LIFE AND ROBOTICS, 2010, 15 (01) : 97 - 100
  • [9] Reinforcement learning algorithm with CTRNN in continuous action space
    Arie, Hiroaki
    Namikawa, Jun
    Ogata, Tetsuya
    Tani, Jun
    Sugano, Shigeki
    [J]. NEURAL INFORMATION PROCESSING, PT 1, PROCEEDINGS, 2006, 4232 : 387 - 396
  • [10] Deep Reinforcement Learning with a Natural Language Action Space
    He, Ji
    Chen, Jianshu
    He, Xiaodong
    Gao, Jianfeng
    Li, Lihong
    Deng, Li
    Ostendorf, Mari
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1621 - 1630