A Max-Min Entropy Framework for Reinforcement Learning

Authors
Han, Seungyul [1]
Sung, Youngchul [2]
Affiliations
[1] UNIST, Grad Sch Artificial Intelligence, Ulsan 44919, South Korea
[2] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome the limitation of the soft actor-critic (SAC) algorithm, which implements maximum entropy RL in model-free, sample-based learning. Whereas maximum entropy RL guides policies to reach states with high entropy in the future, the proposed max-min entropy framework learns to visit states with low entropy and to maximize the entropy of these low-entropy states, thereby promoting better exploration. For general Markov decision processes (MDPs), an efficient algorithm is constructed under the proposed max-min entropy framework based on the disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields a drastic performance improvement over current state-of-the-art RL algorithms.
Pages: 14