Actor-Critic Algorithm with Transition Cost Estimation

Authors
Sergey, Denisov [1]
Lee, Jee-Hyong [1]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Actor-critic algorithm; Reinforcement learning; Continuous action space; Heuristic function;
DOI
10.5391/IJFIS.2016.16.4.270
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We present an approach for accelerating the actor-critic algorithm for reinforcement learning with continuous action spaces. The actor-critic algorithm has already proved robust to infinitely large action spaces in various high-dimensional environments. Despite that success, its main problem remains the same: the speed of convergence to the optimal policy. In high-dimensional state and action spaces, searching for the correct action in each state takes an enormously long time. Therefore, in this paper we propose a search-accelerating function that speeds up the convergence of the algorithm so that it reaches the optimal policy faster. In our method, we assume that actions may have their own distribution of preference that is independent of the state. Since the agent acts randomly in the environment at the beginning of learning, it would be more efficient if actions were taken according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using the Educational Process Mining dataset with records of students' course learning process and their grades.
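The abstract does not specify the form of the heuristic function, so the following is only a minimal sketch of the general idea, under stated assumptions: early in learning, a continuous action is drawn from a state-independent preference distribution, and the actor's own state-dependent Gaussian policy takes over as training proceeds. The names `heuristic_weight` and `select_action`, the linear decay schedule, and the Gaussian form of both distributions are hypothetical and are not taken from the paper.

```python
import random


def heuristic_weight(step, decay_steps=1000):
    """Probability of following the heuristic at a given training step.

    Assumption: a linear decay from 1.0 to 0.0 over `decay_steps` steps.
    """
    return max(0.0, 1.0 - step / decay_steps)


def select_action(policy_mean, policy_std, pref_mean, pref_std,
                  step, rng, decay_steps=1000):
    """Sample one continuous action.

    With probability heuristic_weight(step), draw from the state-independent
    preference distribution (assumed Gaussian here); otherwise draw from the
    actor's state-dependent Gaussian policy.
    """
    if rng.random() < heuristic_weight(step, decay_steps):
        # Early in learning: rely on the action preference, ignoring the state.
        return rng.gauss(pref_mean, pref_std)
    # Later in learning: rely on the learned policy for the current state.
    return rng.gauss(policy_mean, policy_std)


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 500, 1000):
        action = select_action(0.2, 0.1, 0.9, 0.1, step, rng)
        print(step, heuristic_weight(step), round(action, 3))
```

In this sketch the heuristic only changes how actions are sampled during exploration; the critic and the policy-gradient update are left untouched, which is one plausible way to combine a heuristic with an actor-critic learner but not necessarily the authors' design.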
Pages: 270-275
Number of pages: 6