A Fast and Scalable Computational Framework for Large-Scale High-Dimensional Bayesian Optimal Experimental Design*

Cited by: 10
Authors
Wu, Keyi [1]
Chen, Peng [2]
Ghattas, Omar [3,4,5]
Affiliations
[1] Univ Texas Austin, Dept Math, Austin, TX 78705 USA
[2] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30308 USA
[3] Univ Texas Austin, Dept Geol Sci, Austin, TX 78705 USA
[4] Univ Texas Austin, Dept Mech Engn, Austin, TX 78705 USA
[5] Univ Texas Austin, Oden Inst Computat Engn & Sci, Austin, TX 78705 USA
Funding
U.S. National Science Foundation
Keywords
optimal experimental design; Bayesian inverse problems; expected information gain; swapping greedy algorithm; low-rank approximation; offline-online decomposition; LINEAR INVERSE PROBLEMS; STOCHASTIC NEWTON MCMC; A-OPTIMAL DESIGN; UNCERTAINTY QUANTIFICATION; TAYLOR APPROXIMATION; INFORMATION; ALGORITHMS; PARAMETERS; BOUNDARY; FLOW;
DOI
10.1137/21M1466499
Chinese Library Classification
O1 [Mathematics]
Discipline Classification Code
0701; 070101
Abstract
We develop a fast and scalable computational framework to solve Bayesian optimal experimental design problems governed by partial differential equations (PDEs) with application to optimal sensor placement by maximizing expected information gain (EIG). Such problems are particularly challenging due to the curse of dimensionality for high-dimensional parameters and the expensive solution of large-scale PDEs. To address these challenges, we exploit two fundamental properties: (1) the low-rank structure of the Jacobian of the parameter-to-observable map, to extract the intrinsically low-dimensional data-informed subspace, and (2) a series of approximations of the EIG that reduce the number of PDE solves while retaining high correlation with the true EIG. Based on these properties, we propose an efficient offline-online decomposition for the optimization problem. The offline stage dominates the cost and entails precomputing all components that require PDE solves. The online stage optimizes sensor placement and does not require any PDE solves. For the online stage, we propose a new greedy algorithm that first places an initial set of sensors using leverage scores and then swaps the selected sensors with other candidates until certain convergence criteria are met, which we call a swapping greedy algorithm. We demonstrate the efficiency and scalability of the proposed method on both linear and nonlinear inverse problems. In particular, we show that the number of required PDE solves is small, independent of the parameter dimension, and only weakly dependent on the data dimension for both problems.
Pages: 235-261
Page count: 27