Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

Cited by: 0
Authors
Grathwohl, Will [1 ,2 ]
Wang, Kuan-Chieh [1 ,2 ]
Jacobsen, Jorn-Henrik [1 ,2 ]
Duvenaud, David [1 ,2 ]
Zemel, Richard [1 ,2 ]
Affiliations
[1] University of Toronto, Toronto, ON, Canada
[2] Vector Institute, Toronto, ON, Canada
Keywords
Networks
DOI
Not available
Chinese Library Classification
TP [Automation and Computer Technology]
Subject Classification
0812
Abstract
We present a new method for evaluating and training unnormalized density models. Our approach requires access only to the gradient of the unnormalized model's log-density. We estimate the Stein discrepancy between the data density p(x) and the model density q(x), which is defined in terms of a vector-valued function of the data. We parameterize this function with a neural network and fit its parameters to maximize the discrepancy. This yields a novel goodness-of-fit test that outperforms existing methods on high-dimensional data. Furthermore, optimizing q(x) to minimize this discrepancy produces a training method for unnormalized models that scales more gracefully than existing approaches. The ability to both learn and compare models is a unique feature of the proposed method.
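As a rough illustration of the estimator the abstract describes, the sketch below computes a Monte Carlo estimate of the Stein discrepancy E_{x~p}[ f(x)^T grad_x log q(x) + Tr(grad_x f(x)) ] in PyTorch, using Hutchinson's estimator for the trace of the critic's Jacobian. This is a minimal sketch under stated assumptions, not the authors' released code: `score_q` (the model's score function, grad_x log q(x)) and `critic` (the vector-valued network f: R^d -> R^d) are hypothetical placeholders.

```python
import torch

def stein_discrepancy(x, score_q, critic, n_probes=1):
    """Monte Carlo estimate of E_p[ f(x)^T grad_x log q(x) + Tr(grad_x f(x)) ].

    `score_q` and `critic` are illustrative placeholders, not names from the
    paper's code. The Jacobian trace is approximated with Hutchinson's
    estimator, E[eps^T J eps], so the cost stays linear in the dimension of x.
    """
    x = x.detach().requires_grad_(True)
    f_x = critic(x)                                   # f(x), shape (n, d)
    dot = (f_x * score_q(x)).sum(dim=1)               # f(x)^T grad_x log q(x)
    trace = torch.zeros(x.shape[0], device=x.device)
    for _ in range(n_probes):
        eps = torch.randn_like(x)                     # Gaussian probe vector
        # One vector-Jacobian product via autograd gives J^T eps per sample.
        vjp = torch.autograd.grad((f_x * eps).sum(), x, create_graph=True)[0]
        trace = trace + (vjp * eps).sum(dim=1)        # eps^T J eps ~ Tr(J)
    return (dot + trace / n_probes).mean()
```

In the min-max setup the abstract outlines, the critic's parameters would be updated to maximize this quantity (with a regularizer on f, e.g. an L2 penalty, to keep the supremum finite), while for training the energy-based model's parameters would be updated to minimize it; the goodness-of-fit test uses the maximized value directly.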
Pages: 16