Least Squares SVM for Least Squares TD Learning

Cited by: 0
Authors
Jung, Tobias [1 ]
Polani, Daniel [2 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, D-6500 Mainz, Germany
[2] Univ Hertfordshire, Hatfield AL10 9AB, Herts, England
Source
ECAI 2006, PROCEEDINGS | 2006 / Vol. 141
Keywords
DOI
None available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possibly sequential nature) of training data arising in reinforcement learning, we employ a subspace-based variant of LS-SVM that processes the data sequentially and is hence especially suited for online learning. This approach is adapted from the context of Gaussian process regression and turns the unwieldy original optimization problem (with computational complexity cubic in the number of processed data points) into a reduced problem (with computational complexity linear in the number of processed data points). We introduce a QR-decomposition-based approach to solve the resulting generalized normal equations incrementally that is numerically more stable than existing recursive-least-squares-based update algorithms. We also allow a forgetting factor in the updates to track non-stationary target functions (i.e., for use with optimistic policy iteration). Experimental comparison with standard CMAC function approximation indicates that LS-SVMs are well suited for online RL.
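The abstract describes solving normal equations incrementally via a QR decomposition with a forgetting factor. The sketch below is a hypothetical, generic illustration of that idea (not the authors' implementation): a triangular factor R and transformed right-hand side z are maintained so that R x = z is the current weighted least-squares solution; each new equation is appended after down-weighting the old ones by the forgetting factor, and the system is re-triangularized.

```python
import numpy as np

class IncrementalQRLS:
    """Generic QR-based incremental least squares with a forgetting factor.
    Illustrative sketch only; names and structure are assumptions."""

    def __init__(self, dim, forget=1.0):
        self.R = np.zeros((dim, dim))   # upper-triangular factor
        self.z = np.zeros(dim)          # transformed right-hand side
        self.lam = forget               # forgetting factor in (0, 1]

    def update(self, a, b):
        """Incorporate one new equation a . x ~ b."""
        s = np.sqrt(self.lam)
        # Down-weight the accumulated system, append the new row,
        # then re-triangularize with a (reduced) QR factorization.
        A = np.vstack([s * self.R, a[None, :]])
        y = np.concatenate([s * self.z, [b]])
        Q, self.R = np.linalg.qr(A)
        self.z = Q.T @ y

    def solve(self):
        """Return the current least-squares estimate (requires full rank)."""
        return np.linalg.solve(self.R, self.z)
```

Working on the triangular factor directly, instead of accumulating A^T A as recursive least squares does, avoids squaring the condition number, which is the numerical-stability advantage the abstract alludes to. With forget < 1, past equations decay geometrically, allowing the estimate to track a drifting target.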
Pages: 499 / +
Page count: 2