Modelling Policies in MDPs in Reproducing Kernel Hilbert Space

Cited by: 0
Authors
Lever, Guy [1]
Stafford, Ronnie [1]
Affiliations
[1] UCL, London, England
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider modelling policies for MDPs in (vector-valued) reproducing kernel Hilbert function spaces (RKHS). This enables us to work "non-parametrically" in a rich function class, and provides the ability to learn complex policies. We present a framework for performing gradient-based policy optimization in the RKHS, deriving the functional gradient of the return for our policy, which has a simple form and can be estimated efficiently. The policy representation naturally focuses on the relevant region of state space defined by the policy trajectories, and does not rely on a-priori defined basis points; this can be an advantage in high dimensions where suitable basis points may be difficult to define a-priori. The method is adaptive in the sense that the policy representation will naturally adapt to the complexity of the policy being modelled, which is achieved with standard efficient sparsification tools in an RKHS. We argue that finding a good kernel on states can be easier than remetrizing a high dimensional feature space. We demonstrate the approach on benchmark domains and a simulated quadrocopter navigation task.
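As a rough illustration of the functional-gradient idea summarized in the abstract, the sketch below assumes a Gaussian policy whose mean is a function in an RKHS induced by a Gaussian kernel on states. The abstract does not fix these choices, and the names `rbf_kernel`, `KernelPolicy` and `gradient_step` are hypothetical. Under that assumption, the gradient of the log-likelihood with respect to the mean function is a kernel section centred at the visited state, so each gradient step adds basis points at the states the policy actually visits. The sparsification step mentioned in the abstract is omitted.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Scalar Gaussian kernel on states (an assumed kernel choice)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * bandwidth ** 2)))


class KernelPolicy:
    """Gaussian policy with mean h(s) = sum_i k(c_i, s) * w_i (illustrative)."""

    def __init__(self, action_dim, sigma=0.5, bandwidth=1.0):
        self.action_dim = action_dim
        self.sigma = sigma          # fixed exploration noise (assumption)
        self.bandwidth = bandwidth
        self.centres = []           # states that carry a kernel section
        self.weights = []           # one action-space vector per centre

    def mean(self, state):
        """Evaluate the RKHS mean function h at a state."""
        out = np.zeros(self.action_dim)
        for centre, weight in zip(self.centres, self.weights):
            out += rbf_kernel(centre, state, self.bandwidth) * weight
        return out

    def sample(self, state, rng):
        """Draw an action from N(h(state), sigma^2 I)."""
        noise = rng.standard_normal(self.action_dim)
        return self.mean(state) + self.sigma * noise

    def gradient_step(self, states, actions, returns, step_size):
        """One stochastic functional-gradient ascent step.

        Each visited (state, action, return) triple contributes the term
        return * k(state, .) * (action - h(state)) / sigma^2 to the
        gradient estimate, which becomes a new kernel centre.
        """
        # Compute all coefficients against the current h before updating it.
        new_weights = [
            step_size * ret * (np.asarray(a, dtype=float) - self.mean(s))
            / self.sigma ** 2
            for s, a, ret in zip(states, actions, returns)
        ]
        self.centres.extend(np.asarray(s, dtype=float) for s in states)
        self.weights.extend(new_weights)


# Usage: one update from a single observed transition.
policy = KernelPolicy(action_dim=1, sigma=0.5)
policy.gradient_step(
    states=[[0.0]], actions=[[1.0]], returns=[2.0], step_size=0.1
)
```

Because every update appends centres, the representation grows with the number of visited states; in practice this is where a sparsification routine would be applied to keep the number of centres bounded.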
Pages: 590 - 598
Number of pages: 9
Related papers
50 in total
  • [1] An Example of a Reproducing Kernel Hilbert Space
    Tutaj, Edward
    [J]. COMPLEX ANALYSIS AND OPERATOR THEORY, 2019, 13 (01) : 193 - 221
  • [2] An Example of a Reproducing Kernel Hilbert Space
    Edward Tutaj
    [J]. Complex Analysis and Operator Theory, 2019, 13 : 193 - 221
  • [3] Regularization in a functional reproducing kernel Hilbert space
    Wang, Rui
    Xu, Yuesheng
    [J]. JOURNAL OF COMPLEXITY, 2021, 66
  • [4] The Henderson Smoother in Reproducing Kernel Hilbert Space
    Dagum, Estela Bee
    Bianconcini, Silvia
    [J]. JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 2008, 26 (04) : 536 - 545
  • [5] A Brief Digest on Reproducing Kernel Hilbert Space
    Tong, Shou-yu
    Cong, Fu-zhong
    Wang, Zhi-xia
    [J]. INTERNATIONAL CONFERENCE ON COMPUTER, MECHATRONICS AND ELECTRONIC ENGINEERING (CMEE 2016), 2016,
  • [6] Fast quantile regression in reproducing kernel Hilbert space
    Zheng, Songfeng
    [J]. JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2022, 51 (02) : 568 - 588
  • [7] Sampling Theory in Abstract Reproducing Kernel Hilbert Space
    Yoon Mi Hong
    Jong Min Kim
    Kil H. Kwon
    [J]. Sampling Theory in Signal and Image Processing, 2007, 6 (1) : 109 - 121
  • [8] Ensemble forecasts in reproducing kernel Hilbert space family
    Dufee, Benjamin
    Hug, Berenger
    Memin, Etienne
    Tissot, Gilles
    [J]. PHYSICA D-NONLINEAR PHENOMENA, 2024, 459
  • [9] Generalized Mahalanobis depth in the reproducing kernel Hilbert space
    Hu, Yonggang
    Wang, Yong
    Wu, Yi
    Li, Qiang
    Hou, Chenping
    [J]. STATISTICAL PAPERS, 2011, 52 (03) : 511 - 522
  • [10] Local Subspace Classifier in Reproducing Kernel Hilbert Space
    Zou, DF
    [J]. ADVANCES IN MULTIMODAL INTERFACES - ICMI 2000, PROCEEDINGS, 2000, 1948 : 434 - 441