Hierarchical Bayesian methods enable information sharing across regression problems on multiple groups of data. While standard practice is to model regression parameters (effects) as (1) exchangeable across the groups and (2) correlated to differing degrees across covariates, we show that this approach exhibits poor statistical performance when the number of covariates exceeds the number of groups. For instance, in statistical genetics, we might regress dozens of traits (defining groups) for thousands of individuals (responses) on up to millions of genetic variants (covariates). When an analyst has more covariates than groups, we argue that it is often preferable to instead model effects as (1) exchangeable across covariates and (2) correlated to differing degrees across groups. To this end, we propose a hierarchical model expressing our alternative perspective. We devise an empirical Bayes estimator for learning the degree of correlation between groups. We develop theory that demonstrates that our method outperforms the classic approach when the number of covariates dominates the number of groups, and corroborate this result empirically on several high-dimensional multiple regression and classification problems.
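To make the alternative perspective concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian linear model in which each covariate's vector of effects across the Q groups is drawn independently from N(0, Sigma), so that Sigma encodes the correlation between groups. The function name `posterior_mean_effects`, the fixed noise variance, and the simulated data are all illustrative assumptions. Sigma is treated as known here, whereas the abstract describes learning it with an empirical Bayes estimator.

```python
import numpy as np

def posterior_mean_effects(Xs, ys, Sigma, noise_var=1.0):
    """Posterior mean of the D x Q effect matrix, assuming (hypothetically)
    that each covariate's effects across groups are iid N(0, Sigma)."""
    Q = len(Xs)
    D = Xs[0].shape[1]
    # Stack effects group by group: b = [beta^1; ...; beta^Q].
    # Under the assumed prior, Cov(b) = kron(Sigma, I_D).
    prior_prec = np.kron(np.linalg.inv(Sigma), np.eye(D))
    # The likelihood precision is block diagonal, one D x D block per group.
    lik_prec = np.zeros((Q * D, Q * D))
    rhs = np.zeros(Q * D)
    for q in range(Q):
        block = slice(q * D, (q + 1) * D)
        lik_prec[block, block] = Xs[q].T @ Xs[q] / noise_var
        rhs[block] = Xs[q].T @ ys[q] / noise_var
    mean = np.linalg.solve(prior_prec + lik_prec, rhs)
    # Row q of the reshaped array is beta^q; transpose to D x Q.
    return mean.reshape(Q, D).T

# Simulated example (all sizes and values are illustrative assumptions).
rng = np.random.default_rng(0)
D, Q, N = 5, 3, 200
Sigma = np.array([[1.0, 0.8, 0.8],
                  [0.8, 1.0, 0.8],
                  [0.8, 0.8, 1.0]])
# Each row of B_true is one covariate's effects across the Q groups.
B_true = rng.multivariate_normal(np.zeros(Q), Sigma, size=D)
Xs = [rng.normal(size=(N, D)) for _ in range(Q)]
ys = [Xs[q] @ B_true[:, q] + rng.normal(size=N) for q in range(Q)]

B_hat = posterior_mean_effects(Xs, ys, Sigma)

# With a nearly flat prior, the estimate reduces to per-group least squares.
B_flat = posterior_mean_effects(Xs, ys, 1e8 * np.eye(Q))
B_ols = np.column_stack(
    [np.linalg.lstsq(Xs[q], ys[q], rcond=None)[0] for q in range(Q)]
)
```

The dense Kronecker construction is chosen for readability and scales poorly. With millions of covariates, as in the genetics example, a practical method would need to exploit the block structure rather than form a QD x QD matrix.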