OptLayer - Practical Constrained Optimization for Deep Reinforcement Learning in the Real World

Cited by: 0

Authors:

Pham, Tu-Hoa [1]

De Magistris, Giovanni [1]

Tachibana, Ryuki [1]

Affiliations:

[1] IBM Res AI, Tokyo, Japan

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) | 2018

Keywords:

DOI:

Not available

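Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract: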

While deep reinforcement learning techniques have recently produced considerable achievements on many decision-making problems, their use in robotics has largely been limited to simulated worlds or restricted motions, since unconstrained trial-and-error interactions in the real world can have undesirable consequences for the robot or its environment. To overcome such limitations, we propose a novel reinforcement learning architecture, OptLayer, that takes as inputs possibly unsafe actions predicted by a neural network and outputs the closest actions that satisfy chosen constraints. While learning control policies often requires carefully crafted rewards and penalties while exploring the range of possible actions, OptLayer ensures that only safe actions are actually executed and unsafe predictions are penalized during training. We demonstrate the effectiveness of our approach on robot reaching tasks, both simulated and in the real world.
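
The core idea in the abstract (replace a possibly unsafe predicted action with the closest action that satisfies the constraints, and penalize the unsafe prediction) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes simple per-dimension box constraints, for which the closest action in Euclidean distance is obtained by clipping; general constraints would need a constrained-optimization solver instead. The names `project_to_box`, `safe_step`, and `penalty_weight` are hypothetical and chosen for illustration only.

```python
import numpy as np


def project_to_box(action, lower, upper):
    """Closest action (Euclidean distance) inside the box [lower, upper].

    Assumption: constraints are independent per-dimension bounds, so the
    projection reduces to element-wise clipping.
    """
    return np.clip(action, lower, upper)


def safe_step(predicted, lower, upper, penalty_weight=1.0):
    """Return the action to execute and a penalty for the raw prediction.

    The penalty (hypothetical form) grows with the distance between the
    network's prediction and the corrected, constraint-satisfying action.
    """
    safe = project_to_box(predicted, lower, upper)
    correction = float(np.linalg.norm(safe - predicted))
    penalty = -penalty_weight * correction
    return safe, penalty


# Example: a 3-dimensional action with bounds of [-1, 1] on each dimension.
predicted = np.array([0.5, -2.0, 1.5])
lower = np.full(3, -1.0)
upper = np.full(3, 1.0)
safe, penalty = safe_step(predicted, lower, upper)
```

In this sketch only `safe` would be sent to the robot, while `penalty` would be added to the training reward so that the policy learns to predict actions that need no correction.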

Citation

Pages: 6236 - 6243

Number of pages: 8

50 items in total

[41] Model-Based Reinforcement Learning in Continuous Environments Using Real-Time Constrained Optimization
Andersson, Olov
Heintz, Fredrik
Doherty, Patrick
PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015: 2497 - 2503
[42] Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
Jayant, Ashish Kumar
Bhatnagar, Shalabh
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
[43] Deep reinforcement learning assisted surrogate model management for expensive constrained multi-objective optimization
Shao, Shuai
Tian, Ye
Zhang, Yajie
SWARM AND EVOLUTIONARY COMPUTATION, 2025, 92
[44] Myriad: a real-world testbed to bridge trajectory optimization and deep learning
Howe, Nikolaus H.R.
Dufort-Labbé, Simon
Rajkumar, Nitarshan
Bacon, Pierre-Luc
arXiv, 2022
[45] Myriad: a real-world testbed to bridge trajectory optimization and deep learning
Howe, Nikolaus H. R.
Dufort-Labbe, Simon
Rajkumar, Nitarshan
Bacon, Pierre-Luc
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
[46] Deep reinforcement learning assisted novelty search in Voronoi regions for constrained multi-objective optimization
Yang, Yufei
Zhang, Changsheng
Liu, Yi
Ning, Jiaxu
Guo, Ying
SWARM AND EVOLUTIONARY COMPUTATION, 2024, 91
[47] Chance-Constrained Control With Lexicographic Deep Reinforcement Learning
Giuseppi, Alessandro
Pietrabissa, Antonio
IEEE CONTROL SYSTEMS LETTERS, 2020, 4 (03): 755 - 760
[48] Ship Collision Avoidance Using Constrained Deep Reinforcement Learning
Zhang, Rui
Wang, Xiao
Liu, Kezhong
Wu, Xiaolie
Lu, Tianyou
Chao, Zhaohui
2018 5TH INTERNATIONAL CONFERENCE ON BEHAVIORAL, ECONOMIC, AND SOCIO-CULTURAL COMPUTING (BESC), 2018: 115 - 120
[49] Deep Learning with Requirements in the Real World
Stoian, Mihaela Catalina
PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024: 8508 - 8509
[50] Deep reinforcement learning towards real-world dynamic thermal management of data centers
Zhang, Qingang
Zeng, Wei
Lin, Qinjie
Chng, Chin-Boon
Chui, Chee-Kong
Lee, Poh-Seng
APPLIED ENERGY, 2023, 333
