The Differentiable Cross-Entropy Method

Authors
Amos, Brandon [1]
Yarats, Denis [1, 2]
Affiliations
[1] Facebook AI Res, Menlo Pk, CA 94025 USA
[2] NYU, New York, NY 10003 USA
Keywords
OPTIMIZATION; EQUATIONS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study the cross-entropy method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant (DCEM) that enables us to differentiate the output of CEM with respect to the objective function's parameters. In the machine learning setting, this brings CEM inside the end-to-end learning pipeline, which has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting, we show how to embed optimal action sequences into a lower-dimensional space. DCEM enables us to fine-tune CEM-based controllers with policy optimization.
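
For orientation, the following is a minimal sketch of the standard (non-differentiable) cross-entropy method that the abstract takes as its starting point. It is not the authors' DCEM. The function name `cem_minimize`, the diagonal Gaussian sampling distribution, and all hyperparameter defaults are assumptions made for illustration. The hard top-k selection done with `argsort` is the step that blocks gradients with respect to the objective's parameters, which is the limitation a differentiable variant has to address.

```python
import numpy as np


def cem_minimize(f, dim, n_iter=50, n_samples=100, n_elite=10, seed=0):
    """Vanilla CEM sketch: minimize f over R^dim with a diagonal Gaussian.

    All names and defaults here are illustrative assumptions, not taken
    from the paper.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)    # mean of the sampling distribution
    sigma = np.ones(dim)  # per-dimension standard deviation

    for _ in range(n_iter):
        # Draw candidate solutions from the current distribution.
        x = rng.normal(mu, sigma, size=(n_samples, dim))
        values = np.array([f(xi) for xi in x])

        # Hard top-k: keep the n_elite candidates with the lowest objective.
        # This selection is not differentiable in the objective's parameters.
        elite = x[np.argsort(values)[:n_elite]]

        # Refit the distribution to the elite set.
        mu = elite.mean(axis=0)
        sigma = elite.std(axis=0) + 1e-6  # small floor avoids a zero std

    return mu


if __name__ == "__main__":
    # Hypothetical parameterized objective with parameters theta.
    theta = np.array([1.5, -0.5])
    x_star = cem_minimize(lambda x: float(np.sum((x - theta) ** 2)), dim=2)
    print(x_star)
```

In this sketch the returned mean depends on `theta` only through the elite selection, so an automatic-differentiation framework could not propagate a useful gradient from `x_star` back to `theta` without replacing that selection with a smooth relaxation.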
Pages: 12