Learning Intention-Aware Policies in Deep Reinforcement Learning

Citations: 0

Authors:

Zhao, T. ^{[1]}

Wu, S. ^{[1]}

Li, G. ^{[1]}

Chen, Y. ^{[1]}

Niu, G. ^{[2]}

Sugiyama, Masashi ^{[2,3]}

Affiliations:

[1] Tianjin Univ Sci & Technol, Coll Artificial Intelligence, Tianjin 300457, Peoples R China

[2] RIKEN Ctr Adv Intelligence Project, Tokyo 1030027, Japan

[3] Univ Tokyo, Grad Sch Frontier Sci, Tokyo 2778561, Japan

Source:

NEURAL COMPUTATION | 2023 / Vol. 35 / No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

INFORMATION; CURIOSITY;

DOI:

10.1162/neco_a_01607

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep reinforcement learning (DRL) provides an agent with an optimal policy so as to maximize the cumulative rewards. The policy defined in DRL mainly depends on the state, historical memory, and policy model parameters. However, we humans usually take actions according to our own intentions, such as moving fast or slow, in addition to the elements included in traditional policy models. In order to make the action-choosing mechanism more similar to that of humans and to have the agent select actions that incorporate intentions, we propose an intention-aware policy learning method in this letter. To formalize this process, we first define an intention-aware policy by incorporating the intention information into the policy model, which is learned by maximizing the cumulative rewards together with the mutual information (MI) between the intention and the action. Then we derive an approximation of the MI objective that can be optimized efficiently. Finally, we demonstrate the effectiveness of the intention-aware policy in the classical MuJoCo control task and the multigoal continuous chain walking task.
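
As a rough illustration of the quantity the abstract refers to, the sketch below computes the MI between a discrete intention variable and a discrete action from a joint probability table. The function name `mutual_information`, the toy tables, and the discrete setting are illustrative assumptions; the paper's own approximation of the MI objective is not reproduced here.

```python
import math


def mutual_information(joint):
    """Return I(Z; A) in nats, where joint[i][j] = p(intention=i, action=j).

    Hypothetical helper for illustration only; it assumes a small discrete
    table whose entries are non-negative and sum to 1.
    """
    p_z = [sum(row) for row in joint]        # marginal over intentions
    p_a = [sum(col) for col in zip(*joint)]  # marginal over actions
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # terms with zero probability contribute nothing
                mi += p * math.log(p / (p_z[i] * p_a[j]))
    return mi


# Toy case 1: the action ignores the intention (independent variables).
independent = [[0.25, 0.25], [0.25, 0.25]]
# Toy case 2: the action is fully determined by the intention.
determined = [[0.5, 0.0], [0.0, 0.5]]

print(mutual_information(independent))
print(mutual_information(determined))
```

In this toy setting the MI is zero when the action is independent of the intention and reaches ln 2 when the intention fully determines the action, which is why adding an MI term to the reward objective would favor policies whose actions reflect the intention.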

Pages: 1657-1677

Page count: 21
