Policy gradient fuzzy reinforcement learning

Cited by: 0
Authors
Wang, XN [1]
Xu, X [1]
He, HG [1]
Affiliations
[1] Natl Univ Def Technol, Inst Automat, Changsha 410073, Peoples R China
Source
PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2004
Keywords
reinforcement learning; fuzzy control; policy gradient; gradient estimate
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new approach for tuning the conclusions of fuzzy rules based on reinforcement learning. Unlike most existing fuzzy reinforcement learning algorithms, which are based on value functions, our approach, called policy gradient fuzzy reinforcement learning (PGFRL), is based on gradient estimation. In PGFRL, the GPOMDP algorithm is employed to estimate the performance gradient with respect to the parameters of the fuzzy rules. We prove that the parameters of the fuzzy rules converge to a local optimum given the necessary conditions. The experimental results show the effectiveness of PGFRL.
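The abstract gives no implementation details, so the following is only a minimal sketch of how a GPOMDP-style gradient estimate could be applied to fuzzy rule conclusions. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian membership functions, the two-action Bernoulli policy whose logit is the firing-strength-weighted sum of the rule conclusions, the toy environment `toy_step`, and all parameter values. Only the trace and running-average updates follow the standard GPOMDP form.

```python
import math
import random

def firing_strengths(x, centers, width):
    """Normalized Gaussian firing strengths of a hypothetical 1-D rule base."""
    w = [math.exp(-((x - c) / width) ** 2) for c in centers]
    total = sum(w)
    return [wi / total for wi in w]

def gpomdp_gradient(theta, env_step, x0, centers, width, beta, steps, rng):
    """GPOMDP-style estimate of the performance gradient w.r.t. theta.

    theta holds one conclusion parameter per fuzzy rule (assumed policy form).
    env_step(x, a) must return (next_state, reward).
    """
    z = [0.0] * len(theta)      # eligibility trace
    delta = [0.0] * len(theta)  # running average of reward * trace
    x = x0
    for t in range(steps):
        phi = firing_strengths(x, centers, width)
        logit = sum(p * th for p, th in zip(phi, theta))
        p_one = 1.0 / (1.0 + math.exp(-logit))   # probability of action 1
        a = 1 if rng.random() < p_one else 0
        # Gradient of log pi(a | x) for this Bernoulli policy: (a - p_one) * phi
        score = [(a - p_one) * p for p in phi]
        x, r = env_step(x, a)
        z = [beta * zi + si for zi, si in zip(z, score)]
        delta = [di + (r * zi - di) / (t + 1) for di, zi in zip(delta, z)]
    return delta

def toy_step(x, a):
    """Hypothetical environment: action 1 moves right, action 0 moves left."""
    x = max(-1.0, min(1.0, x + (0.1 if a == 1 else -0.1)))
    return x, x  # the reward is the new position

centers = [-1.0, 0.0, 1.0]   # assumed rule centers
theta = [0.0, 0.0, 0.0]      # rule conclusions to be tuned
grad = gpomdp_gradient(theta, toy_step, 0.0, centers, 0.5, 0.9, 5000,
                       random.Random(0))
# One gradient-ascent step on the rule conclusions (the step size is arbitrary).
theta = [th + 0.5 * g for th, g in zip(theta, grad)]
```

The discount-like factor `beta` trades bias against variance in the estimate: values closer to 1 reduce bias but make the estimate noisier.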
Pages: 992 - 995
Number of pages: 4
Related papers
50 in total
  • [21] A Residual Gradient Fuzzy Reinforcement Learning Algorithm for Differential Games
    Mostafa D. Awheda
    Howard M. Schwartz
    International Journal of Fuzzy Systems, 2017, 19 : 1058 - 1076
  • [22] Robot reinforcement learning accuracy-based learning classifier systems with Fuzzy Policy Gradient descent (XCS-FPGRL)
    Shao, Jie
    Yu, Jingru
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ADVANCES IN MECHANICAL ENGINEERING AND INDUSTRIAL INFORMATICS, 2015, 15 : 1013 - 1018
  • [23] Molecule generation using transformers and policy gradient reinforcement learning
    Mazuz, Eyal
    Shtar, Guy
    Shapira, Bracha
    Rokach, Lior
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [24] Reinforcement Learning based on MPC and the Stochastic Policy Gradient Method
    Gros, Sebastien
    Zanon, Mario
    2021 AMERICAN CONTROL CONFERENCE (ACC), 2021 : 1947 - 1952
  • [25] Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
    Morimura, Tetsuro
    Uchibe, Eiji
    Yoshimoto, Junichiro
    Peters, Jan
    Doya, Kenji
    NEURAL COMPUTATION, 2010, 22 (02) : 342 - 376
  • [26] Hessian matrix distribution for Bayesian policy gradient reinforcement learning
    Ngo Anh Vien
    Yu, Hwanjo
    Chung, TaeChoong
    INFORMATION SCIENCES, 2011, 181 (09) : 1671 - 1685
  • [27] Using policy gradient reinforcement learning on autonomous robot controllers
    Grudic, GZ
    Kumar, V
    Ungar, L
    IROS 2003: PROCEEDINGS OF THE 2003 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4, 2003 : 406 - 411
  • [28] Spiking Variational Policy Gradient for Brain Inspired Reinforcement Learning
    Yang, Zhile
    Guo, Shangqi
    Fang, Ying
    Yu, Zhaofei
    Liu, Jian K.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (03) : 1975 - 1990
  • [29] KERNEL-BASED LIFELONG POLICY GRADIENT REINFORCEMENT LEARNING
    Mowakeaa, Rami
    Kim, Seung-Jun
    Emge, Darren K.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 3500 - 3504
  • [30] Cold-Start Reinforcement Learning with Softmax Policy Gradient
    Ding, Nan
    Soricut, Radu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30