Least Squares SVM for Least Squares TD Learning

Times Cited: 0
Authors
Jung, Tobias [1 ]
Polani, Daniel [2 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, D-6500 Mainz, Germany
[2] Univ Hertfordshire, Hatfield AL10 9AB, Herts, England
Source
ECAI 2006, PROCEEDINGS | 2006 / Vol. 141
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possibly sequential nature) of training data arising in reinforcement learning, we employ a subspace-based variant of LS-SVM that processes the data sequentially and is hence especially suited for online learning. This approach is adapted from the context of Gaussian process regression and turns the unwieldy original optimization problem (with computational complexity cubic in the number of processed data) into a reduced problem (with computational complexity linear in the number of processed data). We introduce a QR-decomposition-based approach that solves the resulting generalized normal equations incrementally and is numerically more stable than existing recursive-least-squares-based update algorithms. We also allow a forgetting factor in the updates to track non-stationary target functions (i.e. for use with optimistic policy iteration). Experimental comparison with standard CMAC function approximation indicates that LS-SVMs are well suited for online RL.
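Since the paper itself is not reproduced in this record, the following is only a minimal Python/NumPy sketch of the general idea described in the abstract: a recursive least-squares solver that maintains an upper-triangular QR factor via Givens rotations and down-weights old data with a forgetting factor. The class name QRRecursiveLS, the small ridge term, and the Bellman-residual-style TD regression in the usage snippet are illustrative assumptions, not the authors' actual LS-SVM formulation of LSTD or their generalized normal equations.

import numpy as np

class QRRecursiveLS:
    """Incremental least squares via QR updating with a forgetting factor.

    Maintains an upper-triangular factor R and a vector z such that
    solving R w = z gives the least-squares weights for all rows seen
    so far, with older rows down-weighted by the forgetting factor.
    """

    def __init__(self, dim, forgetting=1.0, ridge=1e-6):
        self.R = np.eye(dim) * np.sqrt(ridge)  # small ridge keeps R invertible
        self.z = np.zeros(dim)
        self.lam = forgetting

    def update(self, x, y):
        # Down-weight past data (forgetting factor for non-stationary targets).
        sqrt_lam = np.sqrt(self.lam)
        self.R *= sqrt_lam
        self.z *= sqrt_lam
        # Append the new row (x, y) and re-triangularize with Givens rotations.
        x = np.asarray(x, dtype=float).copy()
        y = float(y)
        for i in range(len(self.z)):
            a, b = self.R[i, i], x[i]
            r = np.hypot(a, b)
            if r == 0.0:
                continue
            c, s = a / r, b / r
            # Rotate row i of R (and z) against the incoming row, zeroing x[i].
            Ri, xi = self.R[i, i:].copy(), x[i:].copy()
            self.R[i, i:] = c * Ri + s * xi
            x[i:] = -s * Ri + c * xi
            zi, yi = self.z[i], y
            self.z[i] = c * zi + s * yi
            y = -s * zi + c * yi

    def weights(self):
        # Back-substitution: solve the triangular system R w = z.
        return np.linalg.solve(np.triu(self.R), self.z)

# Hypothetical usage on synthetic TD transitions (phi_s, r, phi_next).
# The regression x = phi_s - gamma * phi_next, target r is a
# Bellman-residual-style simplification used here only for illustration.
gamma = 0.95
model = QRRecursiveLS(dim=4, forgetting=0.999)
rng = np.random.default_rng(0)
for _ in range(1000):
    phi_s, phi_next = rng.normal(size=4), rng.normal(size=4)
    r = 1.0 + phi_s.sum() - gamma * phi_next.sum()  # synthetic reward
    model.update(phi_s - gamma * phi_next, r)
print(model.weights())

The point of the sketch is the numerical-stability argument in the abstract: the triangular factor is updated directly with orthogonal rotations rather than propagating an inverse covariance matrix, as a recursive-least-squares update would.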
Pages: 499+
Number of pages: 2