A Max-Min Entropy Framework for Reinforcement Learning

Cited by: 0

Authors:

Han, Seungyul [1]

Sung, Youngchul [2]

Affiliations:

[1] UNIST, Grad Sch Artificial Intelligence, Ulsan 44919, South Korea

[2] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Vol. 34

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome the limitation of the soft actor-critic (SAC) algorithm implementing the maximum entropy RL in model-free sample-based learning. Whereas the maximum entropy RL guides learning for policies to reach states with high entropy in the future, the proposed max-min entropy framework aims to learn to visit states with low entropy and maximize the entropy of these low-entropy states to promote better exploration. For general Markov decision processes (MDPs), an efficient algorithm is constructed under the proposed max-min entropy framework based on disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields drastic performance improvement over the current state-of-the-art RL algorithms.

Pages: 14

50 records in total

[21] Enhanced Max-Min Rate of Users in UAV-Assisted Emergency Networks Using Reinforcement Learning
Kaleem, Zeeshan
Ahmad, Ayaz
Chughtai, Omer
Rodrigues, Joel J. P. C.
IEEE Networking Letters, 2022, 4 (03): 104 - 107
[22] MAX-MIN Ant System
Stützle, T
Hoos, HH
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2000, 16 (08): 889 - 914
[23] Max-Min Greedy Matching
Eden, Alon
Feige, Uriel
Feldman, Michal
THEORY OF COMPUTING, 2022, 18
[24] Orbits in max-min algebra
Semancíková, B
LINEAR ALGEBRA AND ITS APPLICATIONS, 2006, 414 (01) : 38 - 63
[25] On minimization of max-min functions
Bagirov, AM
Rubinov, AM
Optimization And Control With Applications, 2005, 96 : 3 - 33
[26] The structure of max-min hyperplanes
Nitica, V.
LINEAR ALGEBRA AND ITS APPLICATIONS, 2010, 432 (01) : 402 - 429
[27] Upward Max-Min Fairness
Danna, Emilie
Hassidim, Avinatan
Kaplan, Haim
Kumar, Alok
Mansour, Yishay
Raz, Danny
Segalov, Michal
JOURNAL OF THE ACM, 2017, 64 (01) : 1 - 24
[28] Max-Min Processors Scheduling
Alquhayz, Hani
Jemmali, Mahdi
INFORMATION TECHNOLOGY AND CONTROL, 2021, 50 (01): 5 - 12
[29] MAX-MIN TREE PARTITIONING
PERL, Y
SCHACH, SR
JOURNAL OF THE ACM, 1981, 28 (01) : 5 - 15
[30] Max-max, max-min, min-max and min-min knapsack problems with a parametric constraint
Halman, Nir
Kovalyov, Mikhail Y.
Quilliot, Alain
4OR-A QUARTERLY JOURNAL OF OPERATIONS RESEARCH, 2023, 21 (02): 235 - 246
