Hybrid MDP based integrated hierarchical Q-learning

Cited by: 0
Authors
ChunLin Chen
DaoYi Dong
Han-Xiong Li
Tzyh-Jong Tarn
Affiliations
[1] Nanjing University, Department of Control and System Engineering, and State Key Laboratory for Novel Software Technology
[2] Zhejiang University, Institute of Cyber-Systems and Control, State Key Laboratory of Industrial Control Technology
[3] University of New South Wales at the Australian Defence Force Academy, School of Engineering and Information Technology
[4] City University of Hong Kong, Department of Manufacturing Engineering and Engineering Management
[5] Washington University in St. Louis, Department of Electrical and Systems Engineering
Source
Science China Information Sciences, 2011, 54 (11): 2279-2294
Keywords
reinforcement learning; hierarchical Q-learning; hybrid MDP; temporal abstraction
DOI
Not available
Abstract
As a widely used reinforcement learning method, Q-learning is bedeviled by the curse of dimensionality: its computational complexity grows dramatically with the size of the state-action space. To combat this difficulty, an integrated hierarchical Q-learning framework is proposed, based on a hybrid Markov decision process (MDP) with temporal abstraction rather than on the simple MDP. The learning process is naturally organized into multiple levels, e.g., a quantitative (lower) level and a qualitative (upper) level, modeled as an MDP and a semi-MDP (SMDP), respectively. This hierarchical control architecture constitutes a hybrid MDP that serves as the model for hierarchical Q-learning and bridges the two levels of learning. The proposed hierarchical Q-learning scales up well, and the upper-level learning process speeds up learning; hence the approach is an effective integrated learning and control scheme for complex problems. Several experiments are carried out on a puzzle problem in a gridworld environment and on a navigation control problem for a mobile robot, and the results demonstrate the effectiveness and efficiency of the proposed approach.
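The following is a minimal, runnable sketch of how such a two-level scheme can be realized on a small gridworld; it is an illustration assembled from the abstract's description, not the authors' implementation. The upper, qualitative level chooses temporally extended subgoal options and is updated by SMDP Q-learning, Q(s, o) <- Q(s, o) + a[R + g^t max_o' Q(s', o') - Q(s, o)], where t is the number of primitive steps the option ran and R is the reward discounted over those steps; the lower, quantitative level learns primitive actions with ordinary one-step Q-learning. The grid size, subgoal set, intrinsic subgoal reward, and all constants are illustrative assumptions.

import random
from collections import defaultdict

N = 5                                         # 5x5 gridworld: start (0,0), goal (4,4)
GOAL = (N - 1, N - 1)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
SUBGOALS = [(2, 2), (0, 4), (4, 0), (4, 4)]   # option g = "navigate to subgoal g" (assumed)
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1            # illustrative constants, not from the paper

q_hi = defaultdict(float)                     # Q(state, subgoal): qualitative (SMDP) level
q_lo = defaultdict(float)                     # Q(state, subgoal, action): quantitative (MDP) level

def choose(pairs):                            # epsilon-greedy over (choice, Q-value) pairs
    if random.random() < EPS:
        return random.choice(pairs)[0]
    return max(pairs, key=lambda p: p[1])[0]

def step(s, a):                               # deterministic moves clipped at walls; -0.01/step, +1 at goal
    s2 = (min(max(s[0] + a[0], 0), N - 1), min(max(s[1] + a[1], 0), N - 1))
    return s2, (1.0 if s2 == GOAL else -0.01)

for episode in range(3000):
    s, steps = (0, 0), 0
    while s != GOAL and steps < 200:
        g = choose([(g_, q_hi[(s, g_)]) for g_ in SUBGOALS])   # upper level picks an option
        s0, ret, tau = s, 0.0, 0
        while True:                           # lower level runs the option for at least one step
            a = choose([(a_, q_lo[(s, g, a_)]) for a_ in ACTIONS])
            s2, r = step(s, a)
            r_in = 1.0 if s2 == g else r      # intrinsic reward for reaching the subgoal (assumed shaping)
            best = max(q_lo[(s2, g, a_)] for a_ in ACTIONS)
            q_lo[(s, g, a)] += ALPHA * (r_in + GAMMA * best - q_lo[(s, g, a)])
            ret += (GAMMA ** tau) * r         # external reward, discounted within the option
            s, tau, steps = s2, tau + 1, steps + 1
            if s in (g, GOAL) or steps >= 200:
                break
        best = max(q_hi[(s, g_)] for g_ in SUBGOALS)
        q_hi[(s0, g)] += ALPHA * (ret + (GAMMA ** tau) * best - q_hi[(s0, g)])  # SMDP Q-learning update

Because the upper level reasons over a handful of subgoals rather than over raw state-action pairs, its table stays small as the grid grows, which is the scaling benefit the abstract attributes to the qualitative level.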
Pages: 2279-2294
Number of pages: 15
Related papers
50 records in total
  • [1] Hybrid MDP based integrated hierarchical Q-learning
    Chen, ChunLin
    Dong, DaoYi
    Li, Han-Xiong
    Tarn, Tzyh-Jong
    Science China Information Sciences, 2011, 54 (11): 2279-2294
  • [2] Q-learning based on hierarchical evolutionary mechanism
    Authors not specified; Department of Information Engineering, Meijo University, 1-501, Tenpaku, Nagoya, Aichi, 468-8502, Japan
    WSEAS Transactions on Systems and Control, 2008, 3: 219-228
  • [3] Hybrid control for robot navigation - A hierarchical Q-learning algorithm
    Chen, Chunlin
    Li, Han-Xiong
    Dong, Daoyi
    IEEE Robotics & Automation Magazine, 2008, 15 (2): 37-47
  • [4] Swarm Reinforcement Learning Method Based on Hierarchical Q-Learning
    Kuroe, Yasuaki
    Takeuchi, Kenya
    Maeda, Yutaka
    2021 IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2021), 2021
  • [5] Hierarchical clustering with deep Q-learning
    Forster, Richard
    Fulop, Agnes
    Acta Universitatis Sapientiae Informatica, 2018, 10 (1): 86-109
  • [6] A Hybrid Web Recommender System Based on Q-Learning
    Taghipour, Nima
    Kardan, Ahmad
    Applied Computing 2008, Vols. 1-3, 2008: 1164-1168
  • [7] Hierarchical model predictive control strategy based on Q-Learning algorithm for hybrid electric vehicle platoon
    Yin, Yanli
    Huang, Xuejiang
    Zhan, Sen
    Zhang, Xinxin
    Wang, Fuzhen
    Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering, 2024, 238 (2-3): 385-402
  • [8] Nested Q-learning of hierarchical control structures
    Digney, B. L.
    ICNN - 1996 IEEE International Conference on Neural Networks, Vols. 1-4, 1996: 161-166
  • [9] Nested Q-learning of hierarchical control structures
    Digney, B. L.
    ICNN - 1996 IEEE International Conference on Neural Networks, Vols. 1-4, 1996: 1676-1681