Reinforcement Learning with Reward Shaping and Hybrid Exploration in Sparse Reward Scenes

Cited by: 0

Authors:

Yang, Yulong [1]

Cao, Weihua

Guo, Linwei

Gan, Chao

Wu, Min

Affiliation:

[1] China Univ Geosci, Sch Automat, Wuhan 430074, Peoples R China

Source:

2023 IEEE 6TH INTERNATIONAL CONFERENCE ON INDUSTRIAL CYBER-PHYSICAL SYSTEMS, ICPS | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

reinforcement learning; sparse reward; reward shaping; hybrid exploration;

DOI:

10.1109/ICPS58381.2023.10128012

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

High-precision modeling of industrial systems is difficult and costly. Model-free intelligent control methods, represented by reinforcement learning, have been widely applied in industrial systems. The difficulty of evaluating production states and the low value density of processing data cause sparse rewards, which lead to insufficient reinforcement learning performance. To overcome the difficulty of reinforcement learning in sparse reward scenes, a reinforcement learning method with reward shaping and hybrid exploration is proposed. By improving the reward distribution over the state space of the environment, reward shaping makes the state-value estimation of reinforcement learning more accurate. By improving the reward distribution in the time dimension, hybrid exploration makes the iteration of reinforcement learning more efficient and more stable. Finally, the effectiveness of the proposed method is verified by simulations.
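
The abstract does not give the shaping function used in the paper. As a generic illustration only, the sketch below shows classic potential-based reward shaping on a hypothetical one-dimensional chain. It is not the authors' method. The environment, the names GOAL, GAMMA, sparse_reward, potential, and shaped_reward, and the distance-based potential are all assumptions made for this example.

```python
GAMMA = 0.9  # assumed discount factor
GOAL = 5     # hypothetical chain of states 0..5 with the goal at state 5


def sparse_reward(next_state: int) -> float:
    """Sparse signal: 1.0 only when the goal state is reached."""
    return 1.0 if next_state == GOAL else 0.0


def potential(state: int) -> float:
    """Assumed potential: negative distance to the goal (zero at the goal)."""
    return -float(abs(GOAL - state))


def shaped_reward(state: int, next_state: int, gamma: float = GAMMA) -> float:
    """Sparse reward plus the shaping term gamma * phi(s') - phi(s)."""
    shaping = gamma * potential(next_state) - potential(state)
    return sparse_reward(next_state) + shaping
```

With these assumptions, a step toward the goal from state 0 to state 1 receives a positive shaped reward even though the sparse reward is zero, while a step away from the goal (state 1 to state 0) receives a negative one. Every transition therefore carries some learning signal, which is the kind of denser reward distribution over the state space that the abstract describes.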


Pages: 6
