A hybrid transfer algorithm for reinforcement learning based on spectral method

Authors
[1] Zhu, Mei-Qiang
[2] Cheng, Yu-Hu
[3] Li, Ming
[4] Wang, Xue-Song
[5] Feng, Huan-Ting
Source
Zhu, M.-Q. (zhumeiqiang@cumt.edu.cn) | Acta Automatica Sinica, 2012, Vol. 38, p. 1765 (Science Press)
Keywords
Fiedler eigenvector - Hierarchical control structure - Hierarchical decompositions - Laplacian eigenmap - Number of iterations - Proto-Value Functions - Spectral graph theory - Spectral methods
DOI
10.3724/SP.J.1004.2012.01765
Abstract
When a task is transferred to a scaled-up state space under the proto-value function framework, only the basis functions corresponding to the smaller eigenvalues transfer effectively, which leads to an incorrect approximation of the value function in the target task. To address this problem, and exploiting the fact that the Laplacian eigenmap preserves the local topological structure of the state space, an improved hierarchical decomposition algorithm based on spectral graph theory is proposed, and a hybrid transfer method is designed that combines basis function transfer with the transfer of optimal subtask policies. First, the basis functions of the source task are constructed using the spectral method, and the basis functions of the target task are produced by linearly interpolating those of the source task. Second, the second basis function produced for the target task, which approximates the Fiedler eigenvector, is used to decompose the target task. Third, the optimal policies of the subtasks are obtained using the improved hierarchical decomposition algorithm. Finally, the obtained basis functions and optimal subtask policies are transferred to the target task. The proposed hybrid transfer method directly yields the optimal policies of some states, and reduces both the number of iterations and the minimum number of basis functions needed to approximate the value function. The method is suited to scaled-up state space transfer tasks with a hierarchical control structure. Grid-world simulation results verify the validity of the proposed hybrid transfer method. © 2012 Acta Automatica Sinica.
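The decomposition step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-room grid layout, the unnormalized graph Laplacian, the zero threshold on the Fiedler eigenvector, and all function names (`two_room_grid`, `laplacian_basis`, `fiedler_partition`) are assumptions made for illustration. The interpolation of source basis functions and the policy transfer are not shown.

```python
import numpy as np

def two_room_grid(rows=5, cols=11, door_row=2):
    """Build a hypothetical two-room grid world.

    A wall fills the middle column except for one doorway cell.
    Returns the list of free cells and their adjacency matrix.
    """
    wall_col = cols // 2
    cells = [(r, c) for r in range(rows) for c in range(cols)
             if c != wall_col or r == door_row]
    index = {cell: i for i, cell in enumerate(cells)}
    A = np.zeros((len(cells), len(cells)))
    for (r, c), i in index.items():
        # Connect each cell to its right and lower neighbour (if free).
        for dr, dc in ((0, 1), (1, 0)):
            j = index.get((r + dr, c + dc))
            if j is not None:
                A[i, j] = A[j, i] = 1.0
    return cells, A

def laplacian_basis(A, k):
    """Return the k smallest eigenvalues and eigenvectors of L = D - A.

    Assumption: the unnormalized (combinatorial) Laplacian is used here;
    the paper may use a different normalization.
    """
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]

def fiedler_partition(A):
    """Split the states into two groups by the sign of the Fiedler eigenvector."""
    _, basis = laplacian_basis(A, 2)
    fiedler = basis[:, 1]  # eigenvector of the second-smallest eigenvalue
    return fiedler, fiedler >= 0

cells, A = two_room_grid()
eigvals, basis = laplacian_basis(A, 4)
fiedler, labels = fiedler_partition(A)

wall_col = 11 // 2
left = [bool(labels[i]) for i, (r, c) in enumerate(cells) if c < wall_col]
right = [bool(labels[i]) for i, (r, c) in enumerate(cells) if c > wall_col]
print("left room labels:", set(left), "right room labels:", set(right))
```

In this layout the sign of the Fiedler eigenvector separates the two rooms, so each room can be treated as a candidate subtask. The overall sign of an eigenvector is arbitrary, so which room receives which label may differ between runs or library versions, and the doorway cell sits near zero and can fall on either side.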