OptLayer - Practical Constrained Optimization for Deep Reinforcement Learning in the Real World

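Authors
Pham, Tu-Hoa [1]
De Magistris, Giovanni [1]
Tachibana, Ryuki [1]
Affiliations
[1] IBM Research AI, Tokyo, Japan
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) | 2018
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract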
While deep reinforcement learning techniques have recently produced considerable achievements on many decision-making problems, their use in robotics has largely been limited to simulated worlds or restricted motions, since unconstrained trial-and-error interactions in the real world can have undesirable consequences for the robot or its environment. To overcome such limitations, we propose a novel reinforcement learning architecture, OptLayer, that takes as inputs possibly unsafe actions predicted by a neural network and outputs the closest actions that satisfy chosen constraints. While learning control policies often requires carefully crafted rewards and penalties while exploring the range of possible actions, OptLayer ensures that only safe actions are actually executed and unsafe predictions are penalized during training. We demonstrate the effectiveness of our approach on robot reaching tasks, both simulated and in the real world.
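The core idea in the abstract, mapping a possibly unsafe predicted action to the closest action that satisfies the constraints and penalizing the unsafe prediction, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes simple box constraints (per-dimension lower and upper bounds), for which the closest feasible point in Euclidean distance is obtained by clipping, whereas general constraints would require a proper constrained optimization solver. The names `project_to_box`, `safe_step`, and `penalty_weight` are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_to_box(action, lower, upper):
    """Return the point in [lower, upper] closest (Euclidean) to `action`.

    Assumption: constraints are independent per-dimension bounds, so the
    projection reduces to element-wise clipping.
    """
    return np.clip(action, lower, upper)


def safe_step(predicted, lower, upper, penalty_weight=1.0):
    """Map a predicted action to a safe one and compute a training penalty.

    The penalty is a hypothetical choice: proportional to how far the
    prediction was from the feasible set (zero if already safe).
    """
    safe = project_to_box(predicted, lower, upper)
    violation = np.linalg.norm(predicted - safe)
    penalty = -penalty_weight * violation
    return safe, penalty


# Example: a 3-dimensional action with bounds [-1, 1] on each dimension.
predicted = np.array([0.5, -2.0, 1.5])
safe, penalty = safe_step(predicted, lower=-1.0, upper=1.0)
print(safe, penalty)
```

In this sketch only `safe` would be sent to the robot, while `penalty` would be added to the reward during training so that the network is discouraged from predicting infeasible actions.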
Pages: 6236-6243
Page count: 8
Related papers
50 in total
  • [41] Model-Based Reinforcement Learning in Continuous Environments Using Real-Time Constrained Optimization
    Andersson, Olov
    Heintz, Fredrik
    Doherty, Patrick
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 2497 - 2503
  • [42] Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
    Jayant, Ashish Kumar
    Bhatnagar, Shalabh
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [43] Deep reinforcement learning assisted surrogate model management for expensive constrained multi-objective optimization
    Shao, Shuai
    Tian, Ye
    Zhang, Yajie
    SWARM AND EVOLUTIONARY COMPUTATION, 2025, 92
  • [44] Myriad: a real-world testbed to bridge trajectory optimization and deep learning
    Howe, Nikolaus H.R.
    Dufort-Labbé, Simon
    Rajkumar, Nitarshan
    Bacon, Pierre-Luc
    arXiv, 2022,
  • [45] Myriad: a real-world testbed to bridge trajectory optimization and deep learning
    Howe, Nikolaus H. R.
    Dufort-Labbe, Simon
    Rajkumar, Nitarshan
    Bacon, Pierre-Luc
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [46] Deep reinforcement learning assisted novelty search in Voronoi regions for constrained multi-objective optimization
    Yang, Yufei
    Zhang, Changsheng
    Liu, Yi
    Ning, Jiaxu
    Guo, Ying
    SWARM AND EVOLUTIONARY COMPUTATION, 2024, 91
  • [47] Chance-Constrained Control With Lexicographic Deep Reinforcement Learning
    Giuseppi, Alessandro
    Pietrabissa, Antonio
    IEEE CONTROL SYSTEMS LETTERS, 2020, 4 (03): : 755 - 760
  • [48] Ship Collision Avoidance Using Constrained Deep Reinforcement Learning
    Zhang, Rui
    Wang, Xiao
    Liu, Kezhong
    Wu, Xiaolie
    Lu, Tianyou
    Chao, Zhaohui
    2018 5TH INTERNATIONAL CONFERENCE ON BEHAVIORAL, ECONOMIC, AND SOCIO-CULTURAL COMPUTING (BESC), 2018, : 115 - 120
  • [49] Deep Learning with Requirements in the Real World
    Stoian, Mihaela Catalina
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 8508 - 8509
  • [50] Deep reinforcement learning towards real-world dynamic thermal management of data centers
    Zhang, Qingang
    Zeng, Wei
    Lin, Qinjie
    Chng, Chin-Boon
    Chui, Chee-Kong
    Lee, Poh-Seng
    APPLIED ENERGY, 2023, 333