Actor-Critic Algorithm with Transition Cost Estimation

Cited: 0
Authors
Denisov, Sergey [1]
Lee, Jee-Hyong [1]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Actor-critic algorithm; Reinforcement learning; Continuous action space; Heuristic function;
DOI
10.5391/IJFIS.2016.16.4.270
CLC number
TP301 [Theory and Methods];
Discipline code
081202;
Abstract
We present an approach for accelerating the actor-critic algorithm for reinforcement learning with continuous action spaces. The actor-critic algorithm has already proved robust to infinitely large action spaces in various high-dimensional environments. Despite that success, its main problem remains the speed of convergence to the optimal policy: in high-dimensional state and action spaces, searching for the correct action in each state takes an enormously long time. Therefore, in this paper we suggest a search-accelerating function that speeds up convergence and reaches the optimal policy faster. In our method, we assume that actions may have their own preference distribution that is independent of the state. Since the agent acts randomly in the environment at the beginning of learning, it is more efficient to take actions according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using an Educational Process Mining dataset with records of students' course learning activities and their grades.
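The abstract does not give the exact form of the heuristic function, but the core idea of blending a state-independent action preference into an actor-critic agent's action selection can be sketched as follows. All names, the decay schedule, and the toy one-state task are illustrative assumptions for this sketch, not the paper's method:

```python
import numpy as np

def heuristic_weight(step, decay=0.99):
    """Influence of the state-independent heuristic preference.

    Illustrative annealing schedule: the heuristic dominates early,
    when the agent would otherwise act randomly, and fades as the
    learned policy takes over.
    """
    return decay ** step

def run(heuristic_mean=2.0, target=2.5, steps=2000, sigma=0.3, seed=0):
    """Toy one-state continuous-action actor-critic with a heuristic prior.

    The reward is -(a - target)**2; the actor is a Gaussian policy whose
    mean is blended with `heuristic_mean` before each action is sampled.
    """
    rng = np.random.default_rng(seed)
    mu, v = 0.0, 0.0                  # actor mean, critic value baseline
    actor_lr, critic_lr = 0.05, 0.1
    for t in range(steps):
        w = heuristic_weight(t)
        mean = w * heuristic_mean + (1.0 - w) * mu   # heuristic-accelerated mean
        a = rng.normal(mean, sigma)                  # sample continuous action
        r = -(a - target) ** 2                       # toy reward
        delta = r - v                                # TD error (one-step case)
        v += critic_lr * delta                       # critic update
        mu += actor_lr * delta * (a - mean) / sigma**2  # policy-gradient step
    return mu
```

Because the heuristic already places early actions near a sensible region, the actor needs fewer random excursions before the policy-gradient updates become informative; forcing `heuristic_weight` to zero reduces the same loop to a standard actor-critic starting from purely random behavior.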
Pages: 270-275
Page count: 6
Related papers
50 records
  • [1] A Hessian Actor-Critic Algorithm
    Wang, Jing
    Paschalidis, Ioannis Ch.
    [J]. 2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 1131 - 1136
  • [2] An Actor-Critic Algorithm With Second-Order Actor and Critic
    Wang, Jing
    Paschalidis, Ioannis Ch.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (06) : 2689 - 2703
  • [3] THE ACTOR-CRITIC ALGORITHM FOR INFINITE HORIZON DISCOUNTED COST REVISITED
    Gosavi, Abhijit
    [J]. 2020 WINTER SIMULATION CONFERENCE (WSC), 2020, : 2867 - 2878
  • [4] An Actor-Critic Algorithm for SVM Hyperparameters
    Kim, Chayoung
    Park, Jung-min
    Kim, Hye-young
    [J]. INFORMATION SCIENCE AND APPLICATIONS 2018, ICISA 2018, 2019, 514 : 653 - 661
  • [5] A sensitivity formula for risk-sensitive cost and the actor-critic algorithm
    Borkar, VS
    [J]. SYSTEMS & CONTROL LETTERS, 2001, 44 (05) : 339 - 346
  • [6] A Finite Sample Analysis of the Actor-Critic Algorithm
    Yang, Zhuoran
    Zhang, Kaiqing
    Hong, Mingyi
    Basar, Tamer
    [J]. 2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018, : 2759 - 2764
  • [7] Adaptive Advantage Estimation for Actor-Critic Algorithms
    Chen, Yurou
    Zhang, Fengyi
    Liu, Zhiyong
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [8] The Effect of Discounting Actor-loss in Actor-Critic Algorithm
    Yaputra, Jordi
    Suyanto, Suyanto
    [J]. 2021 4TH INTERNATIONAL SEMINAR ON RESEARCH OF INFORMATION TECHNOLOGY AND INTELLIGENT SYSTEMS (ISRITI 2021), 2021,
  • [9] A modified actor-critic reinforcement learning algorithm
    Mustapha, SM
    Lachiver, G
    [J]. 2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000, : 605 - 609
  • [10] Actor-critic algorithms
    Konda, VR
    Tsitsiklis, JN
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12 : 1008 - 1014