KERNEL-BASED LIFELONG POLICY GRADIENT REINFORCEMENT LEARNING

Cited by: 1
Authors
Mowakeaa, Rami [1 ]
Kim, Seung-Jun [1 ]
Emge, Darren K. [2 ]
Affiliations
[1] Univ Maryland Baltimore Cty, Dept Comp Sci & Elect Engn, Baltimore, MD 21250 USA
[2] Chem Biol Ctr RDCB DRC P, Combat Capabil Dev Command, Gunpowder, MD USA
Funding
U.S. National Science Foundation
Keywords
Reinforcement learning; lifelong learning; kernel methods; policy gradients; dictionary learning;
DOI
10.1109/ICASSP39728.2021.9414511
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
Policy gradient methods have been widely used in reinforcement learning (RL), especially thanks to their ability to handle continuous state spaces, their strong convergence guarantees, and their low-complexity updates. Training such methods for individual tasks, however, can still be taxing in terms of learning speed and sample-trajectory collection. Lifelong learning aims to exploit the intrinsic structure shared among a suite of RL tasks, akin to multitask learning, but in an efficient online fashion. In this work, we propose a lifelong RL algorithm that uses kernel methods to leverage nonlinear features of the data under a popular union-of-subspaces model. Experimental results on a set of simple related tasks verify the advantage of the proposed strategy over its single-task and parametric counterparts.
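For orientation, the basic policy gradient update that the abstract builds on can be sketched with a minimal REINFORCE loop on a two-armed bandit. This is a purely illustrative example, not the paper's kernel-based lifelong algorithm; the names (`true_means`, `theta`, `alpha`) and the toy problem are assumptions for the sketch.

```python
import numpy as np

# Illustrative REINFORCE sketch on a two-armed bandit (not the paper's
# kernel-based lifelong method). The policy is a softmax over per-action
# preferences theta; each update moves theta along r * grad(log pi(a)).
rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])   # hypothetical mean rewards; arm 1 is better
theta = np.zeros(2)                 # policy parameters
alpha = 0.1                         # learning rate

def softmax(x):
    z = np.exp(x - x.max())         # shift for numerical stability
    return z / z.sum()

for _ in range(2000):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)               # sample an action from the policy
    r = rng.normal(true_means[a], 0.1)       # sample a noisy reward
    grad_log = -probs                        # d log pi(a) / d theta, softmax policy
    grad_log[a] += 1.0
    theta += alpha * r * grad_log            # REINFORCE update

print(softmax(theta))  # probability mass should concentrate on arm 1
```

In a lifelong setting, the per-task parameters (here, `theta`) would additionally be factored through a shared dictionary learned across tasks, which is the structure the paper kernelizes.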
Pages: 3500-3504 (5 pages)
Related papers (50 in total)
  • [31] Aiolli, Fabio; Martino, Giovanni Da San; Sperduti, Alessandro; Moschitti, Alessandro. Efficient kernel-based learning for trees. 2007 IEEE Symposium on Computational Intelligence and Data Mining, Vols 1 and 2, 2007: 308-315.
  • [32] Filice, Simone; Castellucci, Giuseppe; Martino, Giovanni Da San; Moschitti, Alessandro; Croce, Danilo; Basili, Roberto. KeLP: A kernel-based learning platform. Journal of Machine Learning Research, 2018, 18: 1-5.
  • [33] Fortney, Kristen; Tweed, Douglas. Biological plausibility of kernel-based learning. BMC Neuroscience, 8(Suppl 2).
  • [34] Xu, Xin. A sparse kernel-based least-squares temporal difference algorithm for reinforcement learning. Advances in Natural Computation, Pt 1, 2006, 4221: 47-56.
  • [35] Müller, K. R.; Mika, S.; Rätsch, G.; Tsuda, K.; Schölkopf, B. An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 2001, 12(2): 181-201.
  • [36] Scampicchio, Anna; Bisiacco, Mauro; Pillonetto, Gianluigi. Kernel-based learning of orthogonal functions. Neurocomputing, 2023, 545.
  • [37] Orabona, Francesco; Keshet, Joseph; Caputo, Barbara. Bounded kernel-based online learning. Journal of Machine Learning Research, 2009, 10: 2643-2666.
  • [38] Yan, S. C.; Xu, D.; Zhang, L.; Zhang, B. Y.; Zhang, H. J. Coupled kernel-based subspace learning. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol 1, Proceedings, 2005: 645-650.
  • [39] Wang, X. N.; Xu, X.; He, H. G. Policy gradient fuzzy reinforcement learning. Proceedings of the 2004 International Conference on Machine Learning and Cybernetics, Vols 1-7, 2004: 992-995.
  • [40] Gros, Sebastien; Zanon, Mario. Reinforcement learning based on MPC and the stochastic policy gradient method. 2021 American Control Conference (ACC), 2021: 1947-1952.