Optimal trade-off between exploration and exploitation

Cited by: 15
Authors
Simpkins, Alex [1]
de Callafon, Raymond [1]
Todorov, Emanuel [2]
Affiliations
[1] Univ Calif San Diego, Dept Mech & Aerosp Engn, 9500 Gilman Dr, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Dept Cognit Sci, La Jolla, CA 92093 USA
Funding
U.S. National Science Foundation;
Keywords
DOI
10.1109/ACC.2008.4586462
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Control in an uncertain environment often involves a trade-off between exploratory actions, whose goal is to gather sensory information, and "regular" actions which exploit the information gathered so far and pursue the task objectives. In principle both types of action can be modeled by minimizing a single cost function within the framework of stochastic optimal control. In practice however this is difficult, because the control law must be sensitive to estimation uncertainty which violates the certainty-equivalence principle. In this paper we formalize the problem in a way which captures the essence of the exploration-exploitation trade-off and yet is amenable to numerical methods for optimal control. The key to our approach is augmenting the dynamics of the partially-observable plant with the Kalman filter dynamics, thus obtaining a higher-dimensional but fully-observable plant. The resulting control laws compare favorably to other more ad-hoc approaches. Our formalism is also suitable for modelling human behavior in tasks which benefit from active exploration.
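The augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a hypothetical scalar linear plant with Gaussian noise, and the function name and parameters (a, b, c, q, r) are chosen here for illustration only. The point it shows is that the filter's estimate and variance, taken together, form a state the controller can read directly, so a control law defined on this pair can react to estimation uncertainty.

```python
def augmented_step(x_hat, P, u, y, a=1.0, b=1.0, c=1.0, q=0.1, r=0.5):
    """One step of the augmented (belief) dynamics for a scalar plant.

    Assumed plant (hypothetical, for illustration only):
        x_next = a * x + b * u + w,   w ~ N(0, q)
        y      = c * x_next + v,      v ~ N(0, r)
    The augmented state is (x_hat, P): the Kalman estimate and its variance.
    Both are known to the controller, unlike the true state x.
    """
    # Prediction: propagate the estimate and its variance through the plant model.
    x_pred = a * x_hat + b * u
    P_pred = a * a * P + q

    # Measurement update: standard scalar Kalman gain and correction.
    K = P_pred * c / (c * c * P_pred + r)
    x_new = x_pred + K * (y - c * x_pred)
    P_new = (1.0 - K * c) * P_pred
    return x_new, P_new


# Usage: start from estimate 0 with variance 1, apply no control, observe y = 1.
x_new, P_new = augmented_step(0.0, 1.0, 0.0, 1.0)
```

In this fixed-noise sketch the variance update does not depend on the control. An exploration-exploitation trade-off only arises when the control can influence how informative the measurements are, for example if r were a function of u; that dependence is left out here and would be a further assumption.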
Pages: 33 / +
Page count: 2
Related papers
50 in total
  • [1] The trade-off between knowledge exploration and exploitation in technological innovation
    Li, Dehong
    Lin, Jun
    Cui, Wentian
    Qian, Yanjun
    [J]. JOURNAL OF KNOWLEDGE MANAGEMENT, 2018, 22 (04) : 781 - 801
  • [2] Exploration and exploitation trade-off in multiagent learning
    Takadama, K
    Shimohara, K
    [J]. ICCIMA 2001: FOURTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, PROCEEDINGS, 2001, : 133 - 137
  • [3] Uncertainty quantification and exploration–exploitation trade-off in humans
    Antonio Candelieri
    Andrea Ponti
    Francesco Archetti
    [J]. Journal of Ambient Intelligence and Humanized Computing, 2023, 14 : 6843 - 6876
  • [4] Uncertainty avoidance and the exploration-exploitation trade-off
    Broekhuizen, Thijs L. J.
    Giarratana, Marco S.
    Torres, Anna
    [J]. EUROPEAN JOURNAL OF MARKETING, 2017, 51 (11-12) : 2080 - 2100
  • [5] Protection From Uncertainty in the Exploration/Exploitation Trade-Off
    Walker, Adrian R.
    Navarro, Danielle J.
    Newell, Ben R.
    Beesley, Tom
    [J]. JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION, 2022, 48 (04) : 547 - 568
  • [6] Bifurcation angles in ant foraging networks: A trade-off between exploration and exploitation?
    Berthouze, Luc
    Lorenzi, Alexander
    [J]. FROM ANIMALS TO ANIMATS 10, PROCEEDINGS, 2008, 5040 : 113 - 122
  • [7] Fuzzy Control of Trade-Off between Exploration and Exploitation Properties of Evolutionary Algorithms
    Slowik, Adam
    [J]. HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, PART I, 2011, 6678 : 59 - 66
  • [8] Exploration - exploitation trade-off features a saltatory search behaviour
    Volchenkov, Dimitri
    Helbach, Jonathan
    Tscherepanow, Marko
    Kuehnel, Sina
    [J]. JOURNAL OF THE ROYAL SOCIETY INTERFACE, 2013, 10 (85)
  • [9] Uncertainty quantification and exploration-exploitation trade-off in humans
    Candelieri, Antonio
    Ponti, Andrea
    Archetti, Francesco
    [J]. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 14 (6) : 6843 - 6876
  • [10] Exploration-exploitation Trade-off in a Treasure Hunting Game
    Volchenkov, Dimitri
    Helbach, Jonathan
    Tscherepanow, Marko
    Kuehnel, Sina
    [J]. ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2013, 299 : 101 - 121