Hybrid MDP based integrated hierarchical Q-learning

Cited by: 0
Authors
ChunLin Chen
DaoYi Dong
Han-Xiong Li
Tzyh-Jong Tarn
Affiliations
[1] Nanjing University, Department of Control and System Engineering, and State Key Laboratory for Novel Software Technology
[2] Zhejiang University, Institute of Cyber-Systems and Control, State Key Laboratory of Industrial Control Technology
[3] University of New South Wales at the Australian Defence Force Academy, School of Engineering and Information Technology
[4] City University of Hong Kong, Department of Manufacturing Engineering and Engineering Management
[5] Washington University in St. Louis
Source
Science China Information Sciences | 2011, Vol. 54
Keywords
reinforcement learning; hierarchical Q-learning; hybrid MDP; temporal abstraction;
DOI
Not available
Abstract
As a widely used reinforcement learning method, Q-learning is bedeviled by the curse of dimensionality: the computational complexity grows dramatically with the size of the state-action space. To combat this difficulty, an integrated hierarchical Q-learning framework is proposed, based on a hybrid Markov decision process (MDP) that uses temporal abstraction instead of the simple MDP. The learning process is naturally organized into multiple levels, e.g., a quantitative (lower) level and a qualitative (upper) level, modeled as an MDP and a semi-MDP (SMDP), respectively. This hierarchical control architecture constitutes a hybrid MDP, which serves as the model for hierarchical Q-learning and bridges the two levels of learning. The proposed hierarchical Q-learning scales well, and the upper-level learning process speeds up overall learning; hence the approach is an effective integrated learning and control scheme for complex problems. Experiments are carried out on a puzzle problem in a gridworld environment and on a navigation control problem for a mobile robot. The experimental results demonstrate the effectiveness and efficiency of the proposed approach.
Pages: 2279–2294
Page count: 15
Related papers
50 records
  • [41] Integrated Online Q-Learning Design for Wastewater Treatment Processes
    Zhao, Mingming
    Wang, Ding
    Ren, Jin
    Qiao, Junfei
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2025, 21 (02) : 1833 - 1842
  • [42] HiQ: A hierarchical Q-learning algorithm to solve the reader collision problem
    Ho, J
    Engels, DW
    Sarma, SE
    INTERNATIONAL SYMPOSIUM ON APPLICATIONS AND THE INTERNET WORKSHOPS, PROCEEDINGS, 2006, : 88 - 91
  • [44] Hierarchical control of traffic signals using Q-learning with tile coding
    Abdoos, Monireh
    Mozayani, Nasser
    Bazzan, Ana L. C.
    APPLIED INTELLIGENCE, 2014, 40 (02) : 201 - 213
  • [45] Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
    Ohnishi, Shota
    Uchibe, Eiji
    Yamaguchi, Yotaro
    Nakanishi, Kosuke
    Yasui, Yuji
    Ishii, Shin
    FRONTIERS IN NEUROROBOTICS, 2019, 13
  • [46] Q-Learning Approach for Hierarchical AGC Scheme of Interconnected Power Grids
    Zhou, B.
    Chan, K. W.
    Yu, T.
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON SMART GRID AND CLEAN ENERGY TECHNOLOGIES (ICSGCE 2011), 2011, 12
  • [47] Regenerative Braking Algorithm for Parallel Hydraulic Hybrid Vehicles Based on Fuzzy Q-Learning
    Ning, Xiaobin
    Wang, Jiazheng
    Yin, Yuming
    Shangguan, Jiarong
    Bao, Nanxin
    Li, Ning
    ENERGIES, 2023, 16 (04)
  • [48] Hybrid Directional CR-MAC based on Q-Learning with Directional Power Control
    Carie, Anil
    Li, Mingchu
    Liu, Chang
    Reddy, Prakasha
    Jamal, Waseef
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 81 : 340 - 347
  • [49] Q-Learning Based Routing in Optical Networks
    Bryant, Nolen B.
    Chung, Kwok K.
    Feng, Jie
    Harris, Sommer
    Umeh, Kristine N.
    Aibin, Michal
    2022 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2022, : 419 - 422
  • [50] Learning rates for Q-Learning
    Even-Dar, E
    Mansour, Y
    COMPUTATIONAL LEARNING THEORY, PROCEEDINGS, 2001, 2111 : 589 - 604