How to Explore to Maximize Future Return

Cited by: 0

Authors:

Szepesvari, Csaba [1]

Affiliations:

[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada

Source:

ADVANCES IN ARTIFICIAL INTELLIGENCE (AI 2015) | 2015 / Vol. 9091

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With access to huge-scale distributed systems and more data than ever before, learning systems that learn to make good predictions break yesterday's records on a daily basis. Although prediction problems are important, predicting what to do has its own challenges, which call for specialized solution methods. In this talk, by means of some examples based on recent work on reinforcement learning, I will illustrate the unique opportunities and challenges that arise when a system must learn to make good decisions to maximize long-term return. In particular, I will start by demonstrating that passive data collection inevitably leads to catastrophic data sparsity in sequential decision-making problems (no amount of data is big enough!), while clever algorithms, tailored to this setting, can escape data sparsity, learning essentially arbitrarily faster than what is possible under passive data collection. I will also describe current attempts to scale up such clever algorithms to work on large-scale problems. Amongst the possible approaches, I will discuss the role of sparsity in addressing this challenge in the practical, yet mathematically elegant, setting of "linear bandits". Interestingly, while in the related linear prediction problem sparsity allows one to deal with huge dimensionality in a seamless fashion, the status of this question in the bandit setting is much less understood.

Citations

Pages: 1

50 items in total

[1] INVESTMENT IN QUALITY IMPROVEMENT: HOW TO MAXIMIZE THE RETURN
Gandjour, Afschin
[J]. HEALTH ECONOMICS, 2010, 19 (01) : 31 - 42
[2] Brochure includes advice on how to maximize return on ISO 9000
Unknown
[J]. QUALITY PROGRESS, 1998, 31 (06) : 16 - 16
[3] How Should One Explore the Digital Library of the Future?
Fox, Edward A.
Chandrasekar, Prashant
[J]. Data and Information Management, 2021, 5 (04) : 349 - 362
[4] First return, then explore
Ecoffet, Adrien
Huizinga, Joost
Lehman, Joel
Stanley, Kenneth O.
Clune, Jeff
[J]. NATURE, 2021, 590 (7847) : 580 - 586
[5] First return, then explore
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
[J]. Nature, 2021, 590 : 580 - 586
[6] Refineries Explore IIoT Tools to Maximize Profits
Jenkins, Scott
[J]. CHEMICAL ENGINEERING, 2017, 124 (05) : 16 - 20
[7] Maximize the return on temp staff investments
Davidson, L
[J]. WORKFORCE, 1999, 78 (11) : 58 - 60
[8] MAXIMIZE RETURN ON MOST LIMITED RESOURCES
HOLTSBERRY, A
[J]. JOURNAL OF SYSTEMS MANAGEMENT, 1985, 36 (12) : 14 - 16
[9] Remote makeovers maximize return on spacecraft science
Feder, Toni
[J]. PHYSICS TODAY, 2015, 68 (03) : 19 - 21
[10] BEYOND THE RETURN ON ADVERTISING: ELASTICITY OF THE RETURN ON ADVERTISING AS A DIAGNOSTIC METRIC TO MAXIMIZE PROFIT
Mitchell, Ted
Makienko, Igor
[J]. IDEAS IN MARKETING: FINDING THE NEW AND POLISHING THE OLD, 2015 : 463 - 463
