Actor-Critic Algorithm with Transition Cost Estimation

Cited by: 0

Authors:

Sergey, Denisov [1]

Lee, Jee-Hyong [1]

Affiliations:

[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon, South Korea

Source:

INTERNATIONAL JOURNAL OF FUZZY LOGIC AND INTELLIGENT SYSTEMS | 2016 / Vol. 16 / No. 04

Funding:

National Research Foundation of Singapore;

Keywords:

Actor-critic algorithm; Reinforcement learning; Continuous action space; Heuristic function;

DOI:

10.5391/IJFIS.2016.16.4.270

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

We present an approach for accelerating the actor-critic algorithm for reinforcement learning with continuous action spaces. The actor-critic algorithm has already proved robust to infinitely large action spaces in various high-dimensional environments. Despite that success, its main problem remains the same: the speed of convergence to the optimal policy. In high-dimensional state and action spaces, searching for the correct action in each state takes an enormously long time. Therefore, in this paper we propose a search-accelerating function that speeds up convergence so that the algorithm reaches the optimal policy faster. In our method, we assume that actions may have their own distribution of preference that is independent of the state. Since the agent acts randomly in the environment at the beginning of learning, it would be more efficient if actions were taken according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using the Educational Process Mining dataset, which contains records of students' course learning processes and their grades.
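
The abstract does not specify the form of the heuristic function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian, state-independent action preference and a linearly decaying mixing weight; the names `heuristic_weight`, `select_action`, and `decay_steps` are hypothetical and chosen for illustration.

```python
import random


def heuristic_weight(step, decay_steps=1000):
    """Assumed schedule: the heuristic's influence decays linearly from 1.0 to 0.0."""
    return max(0.0, 1.0 - step / decay_steps)


def select_action(policy_mean, policy_std, heuristic_mean, heuristic_std,
                  step, rng, decay_steps=1000):
    """Pick a continuous action, favouring the heuristic early in learning.

    With probability heuristic_weight(step) the action is drawn from a
    state-independent preference distribution (a hypothetical Gaussian prior);
    otherwise it is drawn from the actor's state-dependent Gaussian policy.
    """
    if rng.random() < heuristic_weight(step, decay_steps):
        # Early exploration guided by the state-independent action preference.
        return rng.gauss(heuristic_mean, heuristic_std)
    # Later in training, rely on the actor's own policy output.
    return rng.gauss(policy_mean, policy_std)


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 500, 1000):
        action = select_action(policy_mean=2.0, policy_std=0.1,
                               heuristic_mean=-1.0, heuristic_std=0.1,
                               step=step, rng=rng)
        print(step, heuristic_weight(step), action)
```

In this sketch the critic and the actor update are left out entirely; only the action-selection step is shown, since that is where a state-independent preference over actions would replace purely random early exploration.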

Pages: 270-275

Page count: 6
