How to Explore to Maximize Future Return

Cited by: 0

Authors:

Szepesvari, Csaba [1]

Affiliation:

[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada

Source:

ADVANCES IN ARTIFICIAL INTELLIGENCE (AI 2015) | 2015, Vol. 9091

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With access to huge-scale distributed systems and more data than ever before, learning systems that learn to make good predictions break yesterday's records on a daily basis. Although prediction problems are important, predicting what to do has its own challenges, which calls for specialized solution methods. In this talk, by means of some examples based on recent work on reinforcement learning, I will illustrate the unique opportunities and challenges that arise when a system must learn to make good decisions to maximize long-term return. In particular, I will start by demonstrating that passive data collection inevitably leads to catastrophic data sparsity in sequential decision making problems (no amount of data is big enough!), while clever algorithms, tailored to this setting, can escape data sparsity, learning essentially arbitrarily faster than what is possible under passive data collection. I will also describe current attempts to scale up such clever algorithms to work on large-scale problems. Amongst the possible approaches, I will discuss the role of sparsity to address this challenge in the practical, yet mathematically elegant setting of "linear bandits". Interestingly, while in the related linear prediction problem, sparsity allows one to deal with huge dimensionality in a seamless fashion, the status of this question in the bandit setting is much less understood.

Pages: 1
