Reinforcement learning with constraint based on mirror descent algorithm

Cited by: 1
Authors
Miyashita, Megumi [1]
Kondo, Toshiyuki [2]
Yano, Shiro [2]
Affiliations
[1] Tokyo Univ Agr & Technol, Grad Sch Engn, Dept Elect & Informat Engn, 2-24-16 Naka Cho, Koganei, Tokyo, Japan
[2] Tokyo Univ Agr & Technol, Inst Engn, Div Adv Informat Technol & Comp Sci, 2-24-16 Naka Cho, Koganei, Tokyo, Japan
Source
Keywords
Constrained optimization; Mirror descent algorithm;
DOI
10.1016/j.rico.2021.100048
Chinese Library Classification number
O29 [Applied mathematics];
Discipline classification code
070104;
Abstract
An important issue in reinforcement learning is making the agent avoid dangers and risks, such as physical collisions, during a task. We propose a reinforcement learning algorithm based on the CoMirror algorithm, named CoMDS, for problems with a functional constraint. We also modify CoMDS into Gaussian CoMDS for practical use. We evaluate our algorithms in simulation on the via-point task of a planar robotic arm with a forbidden area, which serves as the constraint. We find that Gaussian CoMDS explores the policy while satisfying the constraint.
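As a rough illustration of the kind of update the abstract refers to, the sketch below shows a generic CoMirror-style (switching subgradient) loop for minimizing f(x) subject to g(x) <= 0. It is not the authors' CoMDS or Gaussian CoMDS: it assumes the Euclidean mirror map, a 1/sqrt(k) step size, and a toy problem, and every name in it (`comirror_euclidean`, `radius`, the functions `f` and `g`) is hypothetical.

```python
import numpy as np

def comirror_euclidean(f, grad_f, g, grad_g, x0, steps=2000, radius=1.0):
    """Switching-subgradient loop with the Euclidean mirror map (assumed setup)."""
    x = np.asarray(x0, dtype=float)
    best_x, best_f = None, np.inf
    for k in range(1, steps + 1):
        feasible = g(x) <= 0.0
        # Track the best feasible iterate seen so far.
        if feasible and f(x) < best_f:
            best_x, best_f = x.copy(), f(x)
        # Follow the objective when feasible, the constraint when violated.
        d = grad_f(x) if feasible else grad_g(x)
        norm = np.linalg.norm(d)
        if norm == 0.0:
            break
        # Normalized step with an assumed 1/sqrt(k) decay.
        x = x - (radius / (norm * np.sqrt(k))) * d
    return best_x, best_f

# Toy problem (assumption): stay inside the unit disk while approaching c.
c = np.array([2.0, 0.0])
f = lambda x: float(np.sum((x - c) ** 2))
grad_f = lambda x: 2.0 * (x - c)
g = lambda x: float(np.sum(x ** 2) - 1.0)   # g(x) <= 0 means "inside the disk"
grad_g = lambda x: 2.0 * x

best_x, best_f = comirror_euclidean(f, grad_f, g, grad_g, x0=[0.0, 0.5])
```

Only feasible iterates are ever recorded as candidates, so the returned point satisfies the constraint even though intermediate iterates may step outside the disk.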
Pages: 9