Bidirectional Learning for Offline Infinite-width Model-based Optimization

Cited by: 0
Authors
Chen, Can [1]
Zhang, Yingxue [2]
Fu, Jie [3]
Liu, Xue [1]
Coates, Mark [1]
Affiliations
[1] McGill University, Montreal, QC, Canada
[2] Huawei Noah's Ark Lab, Montreal, QC, Canada
[3] Beijing Academy of Artificial Intelligence, Beijing, China
Keywords: (none listed)
DOI: (not available)
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
In offline model-based optimization, we strive to maximize a black-box objective function using only a static dataset of designs and their scores. This problem setting arises in numerous fields, including the design of materials, robots, DNA sequences, and proteins. Recent approaches train a deep neural network (DNN) on the static dataset to act as a proxy function, and then perform gradient ascent on existing designs to obtain potentially high-scoring designs. This methodology frequently suffers from the out-of-distribution problem: the proxy's predictions become unreliable for designs far from the training data, so the optimization returns poor designs. To mitigate this problem, we propose BiDirectional learning for offline Infinite-width model-based optimization (BDI). BDI consists of two mappings: the forward mapping leverages the static dataset to predict the scores of the high-scoring designs, and the backward mapping leverages the high-scoring designs to predict the scores of the static dataset. The backward mapping, neglected in previous work, can distill more information from the static dataset into the high-scoring designs, which effectively mitigates the out-of-distribution problem. For a finite-width DNN model, the loss function of the backward mapping is intractable and only has an approximate form, which leads to a significant deterioration of design quality. We thus adopt an infinite-width DNN model and propose to employ the corresponding neural tangent kernel to yield a closed-form loss for more accurate design updates. Experiments on various tasks verify the effectiveness of BDI. The code is available here.
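As a rough illustration (not the authors' implementation), the following sketch shows the bidirectional scheme with a closed-form kernel-regression proxy. The paper derives this closed form from the neural tangent kernel of an infinite-width DNN; here a plain RBF kernel stands in so the example stays self-contained, and the toy data, function names (rbf_kernel, krr_predict, bidirectional_loss), and hyperparameters (lengthscale, reg, step size) are all illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def rbf_kernel(A, B, lengthscale=1.0):
    # Pairwise squared distances -> RBF Gram matrix.
    # (Stand-in for the NTK of an infinite-width DNN.)
    d2 = jnp.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-d2 / (2.0 * lengthscale ** 2))

def krr_predict(X_train, y_train, X_test, reg=1e-3):
    # Closed-form kernel ridge regression:
    #   f(X_test) = K(X_test, X_train) (K(X_train, X_train) + reg*I)^{-1} y_train
    K = rbf_kernel(X_train, X_train)
    alpha = jnp.linalg.solve(K + reg * jnp.eye(K.shape[0]), y_train)
    return rbf_kernel(X_test, X_train) @ alpha

def bidirectional_loss(X_h, y_h, X, y):
    # Forward: the static dataset (X, y) should predict the target
    # scores y_h of the candidate high-scoring designs X_h.
    fwd = jnp.mean((krr_predict(X, y, X_h) - y_h) ** 2)
    # Backward: the candidates (X_h, y_h), used as training data, should
    # reconstruct the scores of the static dataset -- the mapping that
    # distills dataset information into the designs.
    bwd = jnp.mean((krr_predict(X_h, y_h, X) - y) ** 2)
    return fwd + bwd

# Toy static dataset: 64 designs in 8 dimensions, quadratic score (assumed).
key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (64, 8))
y = -jnp.sum(X ** 2, axis=1)          # black-box scores in the static dataset
X_h = X[jnp.argsort(y)[-4:]]          # init candidates from the top scorers
y_h = jnp.full(4, y.max() + 1.0)      # target scores set above the best seen

# Update the designs themselves by gradient descent on the bidirectional loss.
grad_fn = jax.grad(bidirectional_loss, argnums=0)
for _ in range(100):
    X_h = X_h - 0.1 * grad_fn(X_h, y_h, X, y)
```

Because the proxy prediction is a closed-form function of the candidate designs, both loss terms are differentiable with respect to X_h, which is what makes the direct design updates above possible; with a finite-width DNN, the backward term would require differentiating through an inner training loop.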
Pages: 14