Fast and Data Efficient Reinforcement Learning from Pixels via Non-parametric Value Approximation

Cited by: 0
Authors
Long, Alexander [1]
Blair, Alan [1]
van Hoof, Herke [2]
Affiliations
[1] Univ New South Wales, Sydney, NSW, Australia
[2] Univ Amsterdam, Amsterdam, Netherlands
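Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract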
We present Nonparametric Approximation of Inter-Trace returns (NAIT), a Reinforcement Learning algorithm for discrete-action, pixel-based environments that is both highly sample-efficient and computationally efficient. NAIT is a lazy-learning approach whose update is equivalent to episodic Monte-Carlo on episode completion, but which allows rewards to be incorporated stably while an episode is ongoing. We use a fixed, domain-agnostic representation, simple distance-based exploration, and a proximity-graph-based lookup to facilitate extremely fast execution. We empirically evaluate NAIT on both the 26- and 57-game variants of ATARI100k where, despite its simplicity, it achieves competitive performance in the online setting with a greater than 100x speedup in wall-clock time.
Pages: 7620-7627
Page count: 8