MULTIFIDELITY LINEAR REGRESSION FOR SCIENTIFIC MACHINE LEARNING FROM SCARCE DATA

Cited by: 0

Authors:

Qian, Elizabeth [1,2]

Kang, Dayoung [1]

Sella, Vignesh [3]

Chaudhuri, Anirban [3]

Affiliations:

[1] Georgia Inst Technol, Sch Aerosp Engn, Atlanta, GA 30332 USA

[2] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA

[3] Univ Texas Austin, Oden Inst Computat Engn & Sci, Austin, TX USA

Source:

FOUNDATIONS OF DATA SCIENCE | 2024

Keywords:

Multifidelity methods; scientific machine learning; multifidelity machine learning; control variates; spectral properties; operator inference; model-reduction; multilevel; approximation; networks

DOI:

10.3934/fods.2024049

Chinese Library Classification number:

O29 [Applied mathematics]

Discipline classification code:

070104

Abstract:

Machine learning (ML) methods, which fit data to the parameters of a given parameterized model class, have garnered significant interest as potential methods for learning surrogate models for complex engineering systems where traditional simulation is expensive. However, in many scientific and engineering settings, generating high-fidelity data to train ML models is expensive, and the available budget for generating training data is limited, making high-fidelity training data scarce. ML models trained on scarce data have high variance, resulting in poor expected generalization performance. We propose a new multifidelity training approach for scientific machine learning via linear regression that exploits the scientific context where data of varying fidelities and costs are available; for example, high-fidelity data may be generated by an expensive fully resolved physics simulation whereas lower-fidelity data may arise from a cheaper model based on simplifying assumptions. We use the multifidelity data within an approximate control variate framework to define new multifidelity Monte Carlo estimators for linear regression models. We provide bias and variance analysis of our new estimators that guarantee the approach's accuracy and improved robustness to scarce high-fidelity data. Numerical results demonstrate that our multifidelity training approach achieves similar accuracy to the standard high-fidelity-only approach, significantly reducing high-fidelity data requirements.
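The control-variate idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes a one-dimensional input, a hypothetical feature map `features`, and a fixed control-variate weight `alpha`, and it applies the correction only to the cross-moment between features and outputs. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def features(x):
    """Hypothetical feature map for a 1-D input: columns [1, x]."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x])

def mf_regression(x_hi, y_hi, y_lo_paired, x_lo, y_lo, alpha=1.0):
    """Control-variate-style multifidelity least squares (illustrative only).

    x_hi, y_hi   : the few inputs with expensive high-fidelity outputs
    y_lo_paired  : low-fidelity outputs evaluated at the same x_hi
    x_lo, y_lo   : many additional inputs with low-fidelity outputs only
    alpha        : control-variate weight (assumed fixed, not optimized)
    """
    Phi_hi = features(x_hi)
    Phi_all = np.vstack([Phi_hi, features(x_lo)])
    y_lo_all = np.concatenate([y_lo_paired, y_lo])
    n_hi, n_all = len(Phi_hi), len(Phi_all)

    # Feature second moment: needs inputs only, so use every sample.
    C = Phi_all.T @ Phi_all / n_all

    # Cross-moment between features and outputs, estimated three ways.
    c_hi = Phi_hi.T @ np.asarray(y_hi, dtype=float) / n_hi   # scarce, accurate
    c_lo_small = Phi_hi.T @ np.asarray(y_lo_paired, dtype=float) / n_hi
    c_lo_large = Phi_all.T @ y_lo_all / n_all                # plentiful, cheap

    # High-fidelity estimate plus a correction built from low-fidelity data.
    c_mf = c_hi + alpha * (c_lo_large - c_lo_small)

    # Solve the normal equations C beta = c_mf.
    return np.linalg.solve(C, c_mf)

# Usage with stand-in models (both are assumptions for illustration).
rng = np.random.default_rng(0)
f_hi = lambda x: np.sin(2.0 * x)                  # "expensive" model
f_lo = lambda x: 2.0 * x - (4.0 / 3.0) * x**3     # cheap Taylor approximation
x_hi = rng.uniform(-0.5, 0.5, size=5)
x_lo = rng.uniform(-0.5, 0.5, size=500)
beta = mf_regression(x_hi, f_hi(x_hi), f_lo(x_hi), x_lo, f_lo(x_lo))
print(beta)
```

Setting `alpha=0.0` removes the low-fidelity correction from the cross-moment, which may be a useful baseline when comparing against high-fidelity-only training. How much the correction helps depends on how strongly the two fidelities are correlated.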


Pages: 27
