Selector-Actor-Critic and Tuner-Actor-Critic Algorithms for Reinforcement Learning

Cited by: 0
Authors
Masadeh, Ala'eddin [1 ]
Wang, Zhengdao [1 ]
Kamal, Ahmed E. [1 ]
Affiliations
[1] Iowa State University (ISU), Ames, IA 50011 USA
Funding
U.S. National Science Foundation;
Keywords
Reinforcement learning; model-based learning; model-free learning; actor-critic; GAME; GO;
DOI
10.1109/wcsp.2019.8928124
CLC Classification Number
TP3 [Computing technology; computer technology];
Discipline Classification Code
0812;
Abstract
This work presents two reinforcement learning (RL) architectures that mimic the way rational humans analyze available information and make decisions. The proposed algorithms, called selector-actor-critic (SAC) and tuner-actor-critic (TAC), are obtained by modifying the well-known actor-critic (AC) algorithm. SAC is equipped with an actor, a critic, and a selector. The role of the selector is to determine the most promising action at the current state based on the critic's latest estimate. TAC is model-based and consists of a tuner, a model-learner, an actor, and a critic. After receiving the approximate value of the current state-action pair from the critic and the learned model from the model-learner, the tuner uses the Bellman equation to refine the value of the current state-action pair. The actor then uses this tuned value to optimize the policy. Numerical simulations are used to investigate the performance of the proposed algorithms and to demonstrate their advantages over the AC algorithm.
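The abstract describes two concrete mechanisms: a selector that greedily queries the critic's latest estimate, and a tuner that applies a one-step Bellman backup through a learned model. The following Python fragment is a minimal tabular sketch of those two steps only, not the authors' implementation; all names and sizes (Q, trans_counts, gamma, the 5-state, 3-action space) are illustrative assumptions.

import numpy as np

# Hypothetical tabular setting: 5 states, 3 actions, discount factor 0.9.
n_states, n_actions, gamma = 5, 3, 0.9
Q = np.zeros((n_states, n_actions))  # critic's state-action value table

def select_action(state):
    # SAC selector: pick the most promising action at the current
    # state according to the critic's latest estimate.
    return int(np.argmax(Q[state]))

# TAC model-learner: empirical transition counts (Laplace-smoothed)
# and running reward sums, updated from observed transitions.
trans_counts = np.ones((n_states, n_actions, n_states))
reward_sum = np.zeros((n_states, n_actions))
visits = np.zeros((n_states, n_actions))

def update_model(s, a, r, s_next):
    trans_counts[s, a, s_next] += 1
    reward_sum[s, a] += r
    visits[s, a] += 1

def tuned_value(s, a):
    # TAC tuner: one Bellman backup through the learned model,
    # refining the critic's estimate of the (s, a) value before
    # the actor uses it to update the policy.
    p = trans_counts[s, a] / trans_counts[s, a].sum()
    r_hat = reward_sum[s, a] / max(visits[s, a], 1.0)
    return r_hat + gamma * p @ Q.max(axis=1)

In this sketch the tuned value would stand in for the raw critic estimate in the actor's policy update; the full algorithms in the paper additionally specify how the actor and critic themselves are trained.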
Pages: 6