A Monte-Carlo AIXI Approximation

Cited by: 67
Authors
Veness, Joel [1 ]
Kee Siong Ng [2 ]
Hutter, Marcus [2 ]
Uther, William [1 ]
Silver, David [3 ]
Affiliations
[1] Univ New S Wales, Sydney, NSW 2052, Australia
[2] Australian Natl Univ, Canberra, ACT 0200, Australia
[3] MIT, Cambridge, MA 02139 USA
Funding
Australian Research Council
Keywords
Tree Weighting Method; Universal; Algorithm
DOI
10.1613/jair.3125
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.
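The abstract's central algorithmic ingredient is a Monte-Carlo Tree Search planner. As a rough illustration of the idea, not the paper's ρUCT algorithm, the sketch below reduces the search to its core decision rule: UCB-style action selection with incremental value estimates, applied to a hypothetical one-level bandit problem (arm probabilities, rollout count, and seed are all assumptions for the example).

```python
import math
import random

def ucb1(mean, n_parent, n_child, c=math.sqrt(2)):
    """UCB1 score: exploitation term plus an exploration bonus.
    Unvisited children get infinite score so each is tried once."""
    if n_child == 0:
        return float("inf")
    return mean + c * math.sqrt(math.log(n_parent) / n_child)

def mcts_bandit(arms, n_rollouts=1000, seed=0):
    """One-level Monte-Carlo search (illustrative only): repeatedly pick
    the arm with the highest UCB1 score, sample a Bernoulli reward, and
    update that arm's running mean incrementally."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(1, n_rollouts + 1):
        scores = [ucb1(means[a], t, counts[a]) for a in range(len(arms))]
        a = scores.index(max(scores))
        r = 1.0 if rng.random() < arms[a] else 0.0   # simulated reward
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]        # incremental mean
    return counts, means

# Hypothetical arms with success probabilities 0.2, 0.5, 0.8.
counts, means = mcts_bandit([0.2, 0.5, 0.8])
best = counts.index(max(counts))
```

In a full tree search, this selection rule is applied recursively at every node along a simulated trajectory; the paper's agent additionally samples future observations from a learned Context Tree Weighting model rather than a known environment.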
Pages: 95-142
Page count: 48