A Max-Min Entropy Framework for Reinforcement Learning

Authors
Han, Seungyul [1]
Sung, Youngchul [2]
Affiliations
[1] UNIST, Grad Sch Artificial Intelligence, Ulsan 44919, South Korea
[2] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome the limitation of the soft actor-critic (SAC) algorithm implementing the maximum entropy RL in model-free sample-based learning. Whereas the maximum entropy RL guides learning for policies to reach states with high entropy in the future, the proposed max-min entropy framework aims to learn to visit states with low entropy and maximize the entropy of these low-entropy states to promote better exploration. For general Markov decision processes (MDPs), an efficient algorithm is constructed under the proposed max-min entropy framework based on disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields drastic performance improvement over the current state-of-the-art RL algorithms.
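For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the standard maximum entropy RL objective that SAC optimizes: the discounted return with a per-step entropy bonus. It is not the proposed max-min entropy algorithm, whose details are not given in this record. The function names `entropy` and `soft_return`, the discrete action distributions, and the values of `alpha` and `gamma` are all illustrative assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_return(rewards, action_probs, alpha=0.2, gamma=0.99):
    """Discounted return with an entropy bonus alpha * H(pi(.|s_t)) per step.

    rewards[t] is the reward at step t; action_probs[t] is the policy's
    action distribution at the state visited at step t (hypothetical inputs).
    """
    total = 0.0
    for t, (r, probs) in enumerate(zip(rewards, action_probs)):
        total += (gamma ** t) * (r + alpha * entropy(probs))
    return total


# Two-step trajectory: a uniform policy earns a larger entropy bonus
# than a deterministic one for the same rewards.
uniform = soft_return([1.0, 1.0], [[0.5, 0.5], [0.5, 0.5]])
greedy = soft_return([1.0, 1.0], [[1.0, 0.0], [1.0, 0.0]])
```

Under this objective a policy is rewarded for acting randomly in the states it visits, which is the behavior the abstract contrasts with visiting low-entropy states and raising their entropy.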
Pages: 14