Evaluating Correctness of Reinforcement Learning based on Actor-Critic Algorithm

Cited by: 0
Authors
Kim, Youngjae [1 ]
Hussain, Manzoor [1 ]
Suh, Jae-Won [1 ]
Hong, Jang-Eui [1 ]
Affiliations
[1] Chungbuk Natl Univ, Coll Elect & Comp Engn, Cheongju, South Korea
Funding
National Research Foundation, Singapore;
Keywords
reinforcement learning; actor-critic algorithm; safety-critical system; quality evaluation; correctness;
DOI
10.1109/ICUFN55119.2022.9829571
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Deep learning is used for decision making and functional control in various fields, such as autonomous systems. However, rather than being developed through logical design, deep learning models are trained from learning data. Moreover, only reward values are used to evaluate performance, which does not provide enough information to confirm that the model has learned properly. This paper proposes a new method to assess the correctness of reinforcement learning by considering additional properties of the learning algorithm. The proposed method is applied to the evaluation of Actor-Critic algorithms, and correctness-related insights into the algorithm are confirmed through experiments.
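For context only, below is a minimal sketch of a one-step (TD) actor-critic agent on a hypothetical toy chain environment, ending with the kind of reward-only summary that the abstract argues is insufficient evidence of correctness. This is not the paper's method or code; the environment, constants, and tabular softmax policy are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 5, 2      # toy chain: states 0..4, actions 0 = left, 1 = right
GOAL = N_STATES - 1
GAMMA, ALPHA_ACTOR, ALPHA_CRITIC = 0.95, 0.1, 0.2

theta = np.zeros((N_STATES, N_ACTIONS))   # actor: action preferences (softmax policy)
values = np.zeros(N_STATES)               # critic: state-value estimates

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def step(state, action):
    # Hypothetical toy dynamics: small step cost, reward 1.0 on reaching the goal state.
    nxt = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else -0.01
    return nxt, reward, nxt == GOAL

episode_returns = []
for episode in range(200):
    state, done, ret = 0, False, 0.0
    while not done:
        probs = softmax(theta[state])
        action = rng.choice(N_ACTIONS, p=probs)
        nxt, reward, done = step(state, action)

        # One-step TD error drives both the critic and the actor update.
        target = reward + (0.0 if done else GAMMA * values[nxt])
        td_error = target - values[state]
        values[state] += ALPHA_CRITIC * td_error              # critic update
        grad_log_pi = -probs
        grad_log_pi[action] += 1.0                            # gradient of log pi(a|s) w.r.t. theta[state]
        theta[state] += ALPHA_ACTOR * td_error * grad_log_pi  # actor update

        ret += reward
        state = nxt
    episode_returns.append(ret)

# Reward-only evaluation: a single scalar summary that says little about
# whether the learned policy is actually correct.
print("mean return over last 50 episodes:", round(float(np.mean(episode_returns[-50:])), 3))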
Pages: 320 - 325
Number of pages: 6
Related Papers
50 records in total
  • [41] Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
    Zanette, Andrea
    Wainwright, Martin J.
    Brunskill, Emma
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [42] Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space
    Fan, Zhou
    Su, Rui
    Zhang, Weinan
    Yu, Yong
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2279 - 2285
  • [43] Actor-critic reinforcement learning for the feedback control of a swinging chain
    Dengler, C.
    Lohmann, B.
    [J]. IFAC PAPERSONLINE, 2018, 51 (13): : 378 - 383
  • [44] A Prioritized objective actor-critic method for deep reinforcement learning
    Nguyen, Ngoc Duy
    Nguyen, Thanh Thi
    Vamplew, Peter
    Dazeley, Richard
    Nahavandi, Saeid
    [J]. Neural Computing and Applications, 2021, 33 : 10335 - 10349
  • [45] Dual Variable Actor-Critic for Adaptive Safe Reinforcement Learning
    Lee, Junseo
    Heo, Jaeseok
    Kim, Dohyeong
    Lee, Gunmin
    Oh, Songhwai
    [J]. 2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023, : 7568 - 7573
  • [46] Dynamic Charging Scheme Problem With Actor-Critic Reinforcement Learning
    Yang, Meiyi
    Liu, Nianbo
    Zuo, Lin
    Feng, Yong
    Liu, Minghui
    Gong, Haigang
    Liu, Ming
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (01): : 370 - 380
  • [47] A Prioritized objective actor-critic method for deep reinforcement learning
    Nguyen, Ngoc Duy
    Nguyen, Thanh Thi
    Vamplew, Peter
    Dazeley, Richard
    Nahavandi, Saeid
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (16): : 10335 - 10349
  • [48] Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation
    Zhou, Ruida
    Liu, Tao
    Cheng, Min
    Kalathil, Dileep
    Kumar, P. R.
    Tian, Chao
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [49] An Actor-Critic Algorithm With Second-Order Actor and Critic
    Wang, Jing
    Paschalidis, Ioannis Ch.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (06) : 2689 - 2703
  • [50] A model-based hybrid soft actor-critic deep reinforcement learning algorithm for optimal ventilator settings
    Chen, Shaotao
    Qiu, Xihe
    Tan, Xiaoyu
    Fang, Zhijun
    Jin, Yaochu
    [J]. INFORMATION SCIENCES, 2022, 611 : 47 - 64