MULTIFIDELITY LINEAR REGRESSION FOR SCIENTIFIC MACHINE LEARNING FROM SCARCE DATA

Cited by: 0
Authors
Qian, Elizabeth [1, 2]
Kang, Dayoung [1]
Sella, Vignesh [3]
Chaudhuri, Anirban [3]
Affiliations
[1] Georgia Inst Technol, Sch Aerosp Engn, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
[3] Univ Texas Austin, Oden Inst Computat Engn & Sci, Austin, TX USA
Keywords
Multifidelity methods; scientific machine learning; multifidelity machine learning; control variates; spectral properties; operator inference; model reduction; multilevel; approximation; networks
DOI
10.3934/fods.2024049
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Machine learning (ML) methods, which fit the parameters of a given parameterized model class to data, have garnered significant interest as potential methods for learning surrogate models for complex engineering systems where traditional simulation is expensive. However, in many scientific and engineering settings, generating high-fidelity data to train ML models is expensive, and the available budget for generating training data is limited, making high-fidelity training data scarce. ML models trained on scarce data have high variance, resulting in poor expected generalization performance. We propose a new multifidelity training approach for scientific machine learning via linear regression that exploits the scientific context where data of varying fidelities and costs are available; for example, high-fidelity data may be generated by an expensive fully resolved physics simulation, whereas lower-fidelity data may arise from a cheaper model based on simplifying assumptions. We use the multifidelity data within an approximate control variate framework to define new multifidelity Monte Carlo estimators for linear regression models. We provide bias and variance analyses of our new estimators that guarantee the approach's accuracy and improved robustness to scarce high-fidelity data. Numerical results demonstrate that our multifidelity training approach achieves accuracy similar to that of the standard high-fidelity-only approach while significantly reducing high-fidelity data requirements.
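The control variate idea in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: the function names, the feature map `[1, x]`, the fixed weight `alpha`, and the choice to estimate the Gram matrix from the low-fidelity inputs are all assumptions made for illustration. The sketch corrects the scarce high-fidelity estimate of the moment vector E[phi(x) y] with a low-fidelity term whose expectation is zero, then solves the normal equations.

```python
import numpy as np


def features(x):
    """Hypothetical feature map for a model that is linear in its parameters."""
    return np.column_stack([np.ones_like(x), x])


def mf_linear_regression(x_hi, y_hi, y_lo_at_hi, x_lo, y_lo, alpha=1.0):
    """Control-variate-style estimate of least-squares coefficients.

    x_hi, y_hi   : n scarce high-fidelity input/output pairs
    y_lo_at_hi   : low-fidelity outputs evaluated at the same n inputs
    x_lo, y_lo   : m cheap low-fidelity pairs (typically m >> n)
    alpha        : control variate weight (assumed fixed here, not optimized)
    """
    P_hi = features(x_hi)
    P_lo = features(x_lo)

    # Sample estimates of the moment vector E[phi(x) * y].
    c_hi = P_hi.T @ y_hi / len(x_hi)          # high fidelity, n samples
    c_lo_n = P_hi.T @ y_lo_at_hi / len(x_hi)  # low fidelity, same n samples
    c_lo_m = P_lo.T @ y_lo / len(x_lo)        # low fidelity, m samples

    # The correction (c_lo_m - c_lo_n) has zero mean when both input sets
    # follow the same distribution, so it does not bias the estimate.
    c_mf = c_hi + alpha * (c_lo_m - c_lo_n)

    # Assumption: inputs are cheap, so the Gram matrix uses all m inputs.
    gram = P_lo.T @ P_lo / len(x_lo)
    return np.linalg.solve(gram, c_mf)


# Illustrative usage with made-up high- and low-fidelity models.
rng = np.random.default_rng(0)
x_lo = rng.uniform(-1.0, 1.0, size=2000)
y_lo = 0.9 + 1.8 * x_lo                       # cheap, biased model
x_hi = rng.uniform(-1.0, 1.0, size=10)
y_hi = 1.0 + 2.0 * x_hi + 0.3 * rng.standard_normal(10)
beta_mf = mf_linear_regression(x_hi, y_hi, 0.9 + 1.8 * x_hi, x_lo, y_lo)
print(beta_mf)
```

With `alpha=0` the moment estimate reduces to the high-fidelity-only one; how to choose the weight and what the resulting bias and variance are is the subject of the paper's analysis and is not reproduced here.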
Pages: 27