Optimal trade-off between exploration and exploitation

Cited by: 15
Authors
Simpkins, Alex [1]
de Callafon, Raymond [1]
Todorov, Emanuel [2]
Affiliations
[1] Univ Calif San Diego, Dept Mech & Aerosp Engn, 9500 Gilman Dr, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Dept Cognit Sci, La Jolla, CA 92093 USA
Funding
National Science Foundation (USA)
DOI
10.1109/ACC.2008.4586462
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Control in an uncertain environment often involves a trade-off between exploratory actions, whose goal is to gather sensory information, and "regular" actions, which exploit the information gathered so far and pursue the task objectives. In principle, both types of action can be modeled by minimizing a single cost function within the framework of stochastic optimal control. In practice, however, this is difficult because the control law must be sensitive to estimation uncertainty, which violates the certainty-equivalence principle. In this paper we formalize the problem in a way that captures the essence of the exploration-exploitation trade-off and yet is amenable to numerical methods for optimal control. The key to our approach is augmenting the dynamics of the partially observable plant with the Kalman filter dynamics, thus obtaining a higher-dimensional but fully observable plant. The resulting control laws compare favorably to other, more ad hoc approaches. Our formalism is also suitable for modeling human behavior in tasks that benefit from active exploration.
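The augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes a scalar linear plant, and the function name belief_step, the parameters a, b, c, q, r0, and the rule that measurement noise shrinks with control effort are all hypothetical choices made only so that the control visibly affects estimation uncertainty. The point is that the pair (x_hat, sigma), the Kalman filter's mean and variance, is known exactly to the controller, so it can serve as the state of a fully observable plant.

```python
def belief_step(x_hat, sigma, u, y, a=1.0, b=1.0, c=1.0, q=0.1, r0=1.0):
    """One step of the augmented (belief) dynamics for a scalar plant.

    Assumed plant (hypothetical, for illustration only):
        x' = a*x + b*u + w,   w ~ N(0, q)
        y  = c*x' + v,        v ~ N(0, r(u))
    The augmented state is (x_hat, sigma): the Kalman mean and variance.
    """
    # Hypothetical exploration effect: larger |u| gives cleaner measurements.
    r = r0 / (1.0 + u * u)

    # Kalman prediction step.
    x_pred = a * x_hat + b * u
    sigma_pred = a * a * sigma + q

    # Kalman update step with measurement y.
    k = sigma_pred * c / (c * c * sigma_pred + r)
    x_new = x_pred + k * (y - c * x_pred)
    sigma_new = (1.0 - k * c) * sigma_pred
    return x_new, sigma_new


# The variance update does not depend on y, only on the control u.
_, sigma_passive = belief_step(0.0, 1.0, u=0.0, y=0.0)
_, sigma_active = belief_step(0.0, 1.0, u=2.0, y=2.0)
print(sigma_passive, sigma_active)
```

In this sketch a cost defined on (x_hat, sigma) can penalize uncertainty directly, which is why a control law on the augmented state need not obey certainty equivalence. Under the assumed noise rule, the active control leaves a smaller posterior variance than the passive one.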
Pages: 33 / +
Page count: 2
Related papers
50 in total
  • [31] Type-2 Fuzzy Logic Control of Trade-off between Exploration and Exploitation Properties of Genetic Algorithms
    Slowik, Adam
    [J]. SWARM AND EVOLUTIONARY COMPUTATION, 2012, 7269 : 368 - 376
  • [32] A Novel Active Learning Regression Framework for Balancing the Exploration-Exploitation Trade-Off
    Elreedy, Dina
    Atiya, Amir E.
    Shaheen, Samir, I
    [J]. ENTROPY, 2019, 21 (07)
  • [33] The effects of 24-hour sleep deprivation on the exploration-exploitation trade-off
    Glass, Brian D.
    Maddox, W. Todd
    Bowen, Christopher
    Savarie, Zachary R.
    Matthews, Michael D.
    Markman, Arthur B.
    Schnyer, David M.
    [J]. BIOLOGICAL RHYTHM RESEARCH, 2011, 42 (02) : 99 - 110
  • [34] Boredom begets creativity: A solution to the exploitation-exploration trade-off in predictive coding
    Gomez-Ramirez, Jaime
    Costa, Tommaso
    [J]. BIOSYSTEMS, 2017, 162 : 168 - 176
  • [35] Uncertainty modulated exploration in the trade-off between sensing and acting
    Sengupta, Sonal
    Medendorp, W. Pieter
    Praamstra, Peter
    Selen, Luc P. J.
    [J]. PLOS ONE, 2018, 13 (07)
  • [36] Adaptive tuning of the exploitation-exploration trade-off in four honey bee species
    Allison M. Young
    Axel Brockmann
    Fred C. Dyer
    [J]. Behavioral Ecology and Sociobiology, 2021, 75
  • [37] Adaptive tuning of the exploitation-exploration trade-off in four honey bee species
    Young, Allison M.
    Brockmann, Axel
    Dyer, Fred C.
    [J]. BEHAVIORAL ECOLOGY AND SOCIOBIOLOGY, 2021, 75 (01)
  • [38] Internal states drive nutrient homeostasis by modulating exploration-exploitation trade-off
    Corrales-Carvajal, Veronica Maria
    Faisal, Aldo A.
    Ribeiro, Carlos
    [J]. ELIFE, 2016, 5
  • [39] Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration
    Cohen, Jonathan D.
    McClure, Samuel M.
    Yu, Angela J.
    [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2007, 362 (1481) : 933 - 942
  • [40] Optimal Health Insurance and Trade-Off between Health and Wealth
    Zhang, Yan
    Wu, Yonghong
    [J]. JOURNAL OF APPLIED MATHEMATICS, 2020, 2020