Fast and Data Efficient Reinforcement Learning from Pixels via Non-parametric Value Approximation

Cited by: 0

Authors:

Long, Alexander [1]

Blair, Alan [1]

van Hoof, Herke [2]

Affiliations:

[1] Univ New South Wales, Sydney, NSW, Australia

[2] Univ Amsterdam, Amsterdam, Netherlands

Source:

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022

Keywords:

DOI: not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present Nonparametric Approximation of Inter-Trace returns (NAIT), a reinforcement learning algorithm for discrete-action, pixel-based environments that is both highly sample- and computation-efficient. NAIT is a lazy-learning approach with an update that is equivalent to episodic Monte-Carlo on episode completion, but that allows the stable incorporation of rewards while an episode is ongoing. We make use of a fixed domain-agnostic representation, simple distance-based exploration, and a proximity-graph-based lookup to facilitate extremely fast execution. We empirically evaluate NAIT on both the 26- and 57-game variants of ATARI100k where, despite its simplicity, it achieves competitive performance in the online setting with a greater than 100x speedup in wall-clock time.
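
The abstract describes a lazy-learning value estimate whose update matches episodic Monte-Carlo once an episode finishes. The sketch below illustrates only that general idea and is not the NAIT algorithm: it stores discounted returns per action and averages the k nearest stored returns at query time. All names (`discounted_returns`, `KNNValueTable`, `k`, `gamma`) are hypothetical, the brute-force neighbour search stands in for the proximity-graph lookup mentioned in the abstract, and the in-episode reward incorporation and distance-based exploration are not shown.

```python
import math
from collections import defaultdict


def discounted_returns(rewards, gamma=0.99):
    """Monte-Carlo returns G_t = r_t + gamma * G_{t+1} for a finished episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


class KNNValueTable:
    """Illustrative non-parametric action-value store (an assumption, not NAIT)."""

    def __init__(self, k=3):
        self.k = k
        # action -> list of (feature_vector, monte_carlo_return)
        self.memory = defaultdict(list)

    def add_episode(self, features, actions, rewards, gamma=0.99):
        """Store one completed episode; no parameters are trained (lazy learning)."""
        returns = discounted_returns(rewards, gamma)
        for f, a, g in zip(features, actions, returns):
            self.memory[a].append((tuple(f), g))

    def value(self, feature, action):
        """Average return of the k nearest stored features for this action."""
        entries = self.memory.get(action, [])
        if not entries:
            return 0.0  # assumed default for actions never tried
        # Brute-force search; a proximity graph would replace this for speed.
        nearest = sorted(entries, key=lambda e: math.dist(e[0], feature))[: self.k]
        return sum(g for _, g in nearest) / len(nearest)

    def greedy_action(self, feature, actions):
        """Pick the action with the highest estimated value."""
        return max(actions, key=lambda a: self.value(feature, a))
```

Here `features` would be fixed, domain-agnostic embeddings of the pixel observations; how those are produced is outside the scope of this sketch.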


Pages: 7620-7627

Page count: 8
