Transformed low-rank ANOVA models for high-dimensional variable selection

Cited by: 9
Authors
Jung, Yoonsuh [1]
Zhang, Hong [2]
Hu, Jianhua [3]
Affiliations
[1] Korea Univ, Dept Stat, Seoul, South Korea
[2] Fudan Univ, Inst Biostat, Shanghai, Peoples R China
[3] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
Funding
National Institutes of Health (USA); National Research Foundation, Singapore
Keywords
ANOVA; BIC; diverging number of parameters; high-dimensional variables; low rank; variable selection; CERVICAL-CANCER SUSCEPTIBILITY; REGRESSION; LASSO; GENE; CLASSIFICATION; POLYMORPHISM; ASSOCIATION; XRCC1; REGULARIZATION; LIKELIHOOD;
DOI
10.1177/0962280217753726
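Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Subject classification code
Abstract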
High-dimensional data are often encountered in biomedical, environmental, and other studies. For example, in biomedical studies that involve high-throughput omic data, an important problem is to search for genetic variables that are predictive of a particular phenotype. A conventional solution is to characterize such relationships through regression models in which a phenotype is treated as the response variable and the variables are treated as covariates; this approach becomes particularly challenging when the number of variables exceeds the number of samples. We propose a general framework for expressing the transformed mean of high-dimensional variables in an exponential distribution family via ANOVA models in which a low-rank interaction space captures the association between the phenotype and the variables. This alternative method transforms the variable selection problem into a well-posed problem with the number of observations larger than the number of variables. In addition, we propose a model selection criterion for the new model framework with a diverging number of parameters, and establish the consistency of the selection criterion. We demonstrate the appealing performance of the proposed method in terms of prediction and detection accuracy through simulations and real data analyses.
Pages: 1230-1246
Number of pages: 17
Related papers
50 in total
  • [21] High-Dimensional Optimization using Diagonal and Low-Rank Evolution Strategy
    Wu, Wei
    Li, Zhenhua
    2024 6TH INTERNATIONAL CONFERENCE ON DATA-DRIVEN OPTIMIZATION OF COMPLEX SYSTEMS, DOCS 2024, 2024, : 324 - 331
  • [22] Exploring high-dimensional optimization by sparse and low-rank evolution strategy
    Li, Zhenhua
    Wu, Wei
    Zhang, Qingfu
    Cai, Xinye
    SWARM AND EVOLUTIONARY COMPUTATION, 2025, 92
  • [23] JOINT VARIABLE AND RANK SELECTION FOR PARSIMONIOUS ESTIMATION OF HIGH-DIMENSIONAL MATRICES
    Bunea, Florentina
    She, Yiyuan
    Wegkamp, Marten H.
    ANNALS OF STATISTICS, 2012, 40 (05): : 2359 - 2388
  • [24] HIGH-DIMENSIONAL VARIABLE SELECTION
    Wasserman, Larry
    Roeder, Kathryn
    ANNALS OF STATISTICS, 2009, 37 (5A): : 2178 - 2201
  • [25] L2RM: Low-Rank Linear Regression Models for High-Dimensional Matrix Responses
    Kong, Dehan
    An, Baiguo
    Zhang, Jingwen
    Zhu, Hongtu
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115 (529) : 403 - 424
  • [26] Review: Reversed low-rank ANOVA model for transforming high dimensional genetic data into low dimension
    Yoonsuh Jung
    Jianhua Hu
    Journal of the Korean Statistical Society, 2019, 48 : 169 - 178
  • [27] Review: Reversed low-rank ANOVA model for transforming high dimensional genetic data into low dimension
    Jung, Yoonsuh
    Hu, Jianhua
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2019, 48 (02) : 169 - 178
  • [28] Variable selection and estimation for high-dimensional spatial autoregressive models
    Cai, Liqian
    Maiti, Tapabrata
    SCANDINAVIAN JOURNAL OF STATISTICS, 2020, 47 (02) : 587 - 607
  • [29] Variable selection in high-dimensional quantile varying coefficient models
    Tang, Yanlin
    Song, Xinyuan
    Wang, Huixia Judy
    Zhu, Zhongyi
    JOURNAL OF MULTIVARIATE ANALYSIS, 2013, 122 : 115 - 132
  • [30] FACTOR MODELS AND VARIABLE SELECTION IN HIGH-DIMENSIONAL REGRESSION ANALYSIS
    Kneip, Alois
    Sarda, Pascal
    ANNALS OF STATISTICS, 2011, 39 (05): : 2410 - 2447