How to Explore to Maximize Future Return

Cited by: 0
Author
Szepesvari, Csaba [1]
Affiliation
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
With access to huge-scale distributed systems and more data than ever before, learning systems that learn to make good predictions break yesterday's records on a daily basis. Although prediction problems are important, predicting what to do has its own challenges, which call for specialized solution methods. In this talk, using examples based on recent work on reinforcement learning, I will illustrate the unique opportunities and challenges that arise when a system must learn to make good decisions to maximize long-term return. In particular, I will start by demonstrating that passive data collection inevitably leads to catastrophic data sparsity in sequential decision-making problems (no amount of data is big enough!), while clever algorithms tailored to this setting can escape data sparsity, learning essentially arbitrarily faster than is possible under passive data collection. I will also describe current attempts to scale up such algorithms to work on large-scale problems. Among the possible approaches, I will discuss the role of sparsity in addressing this challenge in the practical, yet mathematically elegant, setting of "linear bandits". Interestingly, whereas in the related linear prediction problem sparsity allows one to deal with huge dimensionality in a seamless fashion, its role in the bandit setting is much less understood.
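As a rough illustration of the contrast the abstract draws between passive and adaptive data collection, the sketch below compares two policies on a small Bernoulli multi-armed bandit. It is not taken from the talk: the UCB1 rule stands in for the "clever algorithms" mentioned above, uniform random arm selection stands in for passive data collection, and the function name `run_bandit`, the arm means, and the horizon are all hypothetical choices made for this example.

```python
import math
import random


def run_bandit(means, horizon, policy, seed=0):
    """Play a Bernoulli bandit for `horizon` rounds; return cumulative pseudo-regret.

    `policy` is "uniform" (non-adaptive, a stand-in for passive data
    collection) or "ucb1" (an adaptive rule, used here only as an example).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # number of pulls of each arm
    sums = [0.0] * k   # total observed reward of each arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if policy == "uniform":
            arm = rng.randrange(k)
        elif policy == "ucb1":
            if t <= k:
                arm = t - 1  # pull every arm once to initialize the estimates
            else:
                # Empirical mean plus an optimism bonus that shrinks with pulls.
                arm = max(
                    range(k),
                    key=lambda a: sums[a] / counts[a]
                    + math.sqrt(2.0 * math.log(t) / counts[a]),
                )
        else:
            raise ValueError(f"unknown policy: {policy}")
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]  # expected loss of this choice
    return regret


if __name__ == "__main__":
    arm_means = [0.5] * 9 + [0.9]  # assumed problem: one clearly better arm
    for name in ("uniform", "ucb1"):
        print(name, round(run_bandit(arm_means, 5000, name), 1))
```

Under these assumptions, the uniform policy keeps paying the same expected loss in every round, so its regret grows linearly with the horizon, while the adaptive rule concentrates its pulls on the better arm as evidence accumulates. The linear bandit setting discussed in the abstract is more general than this finite-armed example.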
Pages: 1