For high-dimensional hierarchical models, consider exchangeability of effects across covariates instead of across datasets

Cited by: 0
Authors
Trippe, Brian L. [1]
Finucane, Hilary K. [2]
Broderick, Tamara [1]
Affiliations
[1] MIT, CSAIL, Cambridge, MA 02139 USA
[2] Broad Inst, Cambridge, MA USA
Keywords
SEEMINGLY UNRELATED REGRESSION; EMPIRICAL BAYES ESTIMATORS; VARIABLE SELECTION; PREDICTION; DISEASES; RISK;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Hierarchical Bayesian methods enable information sharing across regression problems on multiple groups of data. While standard practice is to model regression parameters (effects) as (1) exchangeable across the groups and (2) correlated to differing degrees across covariates, we show that this approach exhibits poor statistical performance when the number of covariates exceeds the number of groups. For instance, in statistical genetics, we might regress dozens of traits (defining groups) for thousands of individuals (responses) on up to millions of genetic variants (covariates). When an analyst has more covariates than groups, we argue that it is often preferable to instead model effects as (1) exchangeable across covariates and (2) correlated to differing degrees across groups. To this end, we propose a hierarchical model expressing our alternative perspective. We devise an empirical Bayes estimator for learning the degree of correlation between groups. We develop theory that demonstrates that our method outperforms the classic approach when the number of covariates dominates the number of groups, and corroborate this result empirically on several high-dimensional multiple regression and classification problems.
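The contrast drawn in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' model or their empirical Bayes estimator: the sizes `D` and `Q`, the covariance `Sigma`, and all variable names are hypothetical, and the true effects are treated as directly observed, which is never the case in practice. It only shows why a Q x Q covariance across groups is far easier to learn than a D x D covariance across covariates when D exceeds Q.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: many covariates (D), few groups (Q).
D, Q = 500, 3

# Assumed Q x Q covariance of effects across groups (positive definite).
Sigma = np.array([
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
])

# Exchangeable across covariates: each covariate's length-Q effect vector
# is an i.i.d. draw from N(0, Sigma). Row d holds covariate d's effects.
B = rng.multivariate_normal(np.zeros(Q), Sigma, size=D)  # shape (D, Q)

# Covariance across groups: a Q x Q moment estimate built from D draws.
Sigma_hat = B.T @ B / D

# Covariance across covariates: a D x D moment estimate built from only
# Q draws (one column per group), so its rank cannot exceed Q.
Gamma_hat = B @ B.T / Q

max_err = float(np.max(np.abs(Sigma_hat - Sigma)))
rank_gamma = int(np.linalg.matrix_rank(Gamma_hat))

print("max abs error of Q x Q estimate:", round(max_err, 3))
print("rank of D x D estimate:", rank_gamma, "out of", D)
```

In this toy setting the Q x Q estimate averages over D draws, while the D x D estimate is rank deficient because it is built from only Q columns. A real analysis would have to estimate the covariance from noisy regression data rather than from known effects.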
Pages: 14