Automated State Feature Learning for Actor-Critic Reinforcement Learning through NEAT

Cited by: 2
Authors
Peng, Yiming [1 ]
Chen, Gang [1 ]
Holdaway, Scott [1 ]
Mei, Yi [1 ]
Zhang, Mengjie [1 ]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, Wellington, New Zealand
Keywords
NeuroEvolution; NEAT; Actor-Critic; Reinforcement Learning; Feature Extraction; Feature Learning
DOI
10.1145/3067695.3076035
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Actor-Critic (AC) algorithms are important approaches to solving sophisticated reinforcement learning problems. However, the learning performance of these algorithms relies heavily on good state features, which are often designed manually. To address this issue, we propose an evolutionary approach based on NeuroEvolution of Augmenting Topologies (NEAT) to automatically evolve neural networks that directly transform raw environmental inputs into state features. Following this idea, we have developed a new algorithm called NEAT+AC, which combines Regular-gradient Actor-Critic (RAC) with NEAT. It simultaneously learns suitable state features and good policies, which is expected to significantly improve reinforcement learning performance. Preliminary experiments on two benchmark problems confirm that the new algorithm clearly outperforms the baseline algorithm, NEAT.
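The abstract describes a two-part learner: NEAT evolves a network that maps raw observations to state features, and a regular-gradient actor-critic (RAC) learner fits a value function and a policy on top of those features. The sketch below illustrates one plausible shape of the RAC inner loop, assuming a hypothetical `evolved_features` network (here a fixed random projection) standing in for a NEAT-evolved genome; the step sizes, feature dimension, and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical stand-in for a NEAT-evolved feature network. In NEAT+AC,
# NEAT would evolve the topology and weights of this mapping from raw
# environmental inputs to state features; here it is a fixed random
# projection with a tanh nonlinearity, purely for illustration.
rng = np.random.default_rng(0)
N_INPUTS, N_FEATURES, N_ACTIONS = 4, 8, 2
W_feat = rng.normal(size=(N_FEATURES, N_INPUTS))

def evolved_features(obs):
    """Map raw inputs to state features (illustrative placeholder)."""
    return np.tanh(W_feat @ obs)

GAMMA, ALPHA_CRITIC, ALPHA_ACTOR = 0.99, 0.05, 0.01  # assumed step sizes
w = np.zeros(N_FEATURES)                    # critic: linear value weights
theta = np.zeros((N_ACTIONS, N_FEATURES))   # actor: softmax policy weights

def policy(phi):
    """Softmax action probabilities computed over the evolved features."""
    prefs = theta @ phi
    prefs -= prefs.max()                    # subtract max for stability
    p = np.exp(prefs)
    return p / p.sum()

def rac_step(obs, action, reward, next_obs, done):
    """One regular-gradient actor-critic update on a single transition."""
    global w, theta
    phi, phi_next = evolved_features(obs), evolved_features(next_obs)
    v_next = 0.0 if done else w @ phi_next
    delta = reward + GAMMA * v_next - w @ phi   # TD(0) error
    w += ALPHA_CRITIC * delta * phi             # critic update
    p = policy(phi)
    grad_log = -np.outer(p, phi)                # grad of log pi w.r.t. theta
    grad_log[action] += phi                     # row of the taken action
    theta += ALPHA_ACTOR * delta * grad_log     # regular-gradient actor update

# Example: one update on a fabricated transition.
obs, next_obs = rng.normal(size=N_INPUTS), rng.normal(size=N_INPUTS)
rac_step(obs, action=0, reward=1.0, next_obs=next_obs, done=False)
```

In the full NEAT+AC setting, an outer evolutionary loop would score each candidate feature network by the return the actor-critic learner attains with it, using that return as the genome's fitness; the sketch above covers only the gradient-based inner loop.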
Pages: 135-136
Page count: 2