Multi-task Reinforcement Learning in Partially Observable Stochastic Environments

Cited: 0
Authors
Li, Hui [1 ]
Liao, Xuejun [1 ]
Carin, Lawrence [1 ]
Affiliations
[1] Duke Univ, Dept Elect & Comp Engn, Durham, NC 27708 USA
Keywords
reinforcement learning; partially observable Markov decision processes; multi-task learning; Dirichlet processes; regionalized policy representation; hidden Markov models; infinite horizon; distributions
DOI: not available
Chinese Library Classification: TP [automation technology; computer technology]
Discipline classification code: 0812
Abstract
We consider the problem of multi-task reinforcement learning (MTRL) in multiple partially observable stochastic environments. We introduce the regionalized policy representation (RPR) to characterize the agent's behavior in each environment. The RPR is a parametric model of the conditional distribution over current actions given the history of past actions and observations; the agent's choice of actions is directly based on this conditional distribution, without an intervening model to characterize the environment itself. We propose off-policy batch algorithms to learn the parameters of the RPRs, using episodic data collected when following a behavior policy, and show their linkage to policy iteration. We employ the Dirichlet process as a nonparametric prior over the RPRs across multiple environments. The intrinsic clustering property of the Dirichlet process imposes sharing of episodes among similar environments, which effectively reduces the number of episodes required for learning a good policy in each environment, when data sharing is appropriate. The number of distinct RPRs and the associated clusters (the sharing patterns) are automatically discovered by exploiting the episodic data as well as the nonparametric nature of the Dirichlet process. We demonstrate the effectiveness of the proposed RPR as well as the RPR-based MTRL framework on various problems, including grid-world navigation and multi-aspect target classification. The experimental results show that the RPR is a competitive reinforcement learning algorithm in partially observable domains, and the MTRL consistently achieves better performance than single-task reinforcement learning.
Pages: 1131-1186
Page count: 56
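
The abstract describes the RPR as a history-conditioned distribution over actions, mediated by latent "belief regions", with a Dirichlet-process prior tying RPRs across environments. The following is a minimal, hypothetical sketch of such a history-conditioned policy, not the authors' implementation: the parameter names (pi, W, phi), the toy dimensions, the random Dirichlet initialization, and the helper action_distribution are all assumptions made for illustration.

```python
# Minimal sketch of an RPR-style history-conditioned policy (illustrative only).
# A small set of latent "belief regions" z mediates between the action-observation
# history and the next action; all parameters below are random placeholders.
import numpy as np

rng = np.random.default_rng(0)

n_regions, n_actions, n_obs = 3, 2, 4   # assumed toy sizes

# pi[z]          : distribution over the initial belief region
# W[z, a, o, z'] : region transition probability given previous region, action, observation
# phi[z, a]      : action probabilities within a region
pi = rng.dirichlet(np.ones(n_regions))
W = rng.dirichlet(np.ones(n_regions), size=(n_regions, n_actions, n_obs))
phi = rng.dirichlet(np.ones(n_actions), size=n_regions)


def action_distribution(history, pi=pi, W=W, phi=phi):
    """p(a_t | h_t) for a history h_t = [(a_0, o_1), (a_1, o_2), ...],
    computed by forward-filtering the latent belief region."""
    belief = pi.copy()                     # p(z_0)
    for a, o in history:
        belief = belief * phi[:, a]        # weight regions by the action actually taken
        belief = belief @ W[:, a, o, :]    # propagate regions given (action, observation)
        belief /= belief.sum()             # renormalize to a distribution
    return belief @ phi                    # marginalize regions -> p(a | history)


# Usage: sample an action after two (action, observation) steps.
hist = [(0, 2), (1, 0)]
p_a = action_distribution(hist)
print("p(a | history) =", p_a, " -> action", rng.choice(n_actions, p=p_a))
```

In the full MTRL framework described in the abstract, one such parameter set would be associated with each environment and drawn from a Dirichlet-process prior, whose clustering behavior lets similar environments share episodes; that learning machinery is omitted from this sketch.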