Transformed low-rank ANOVA models for high-dimensional variable selection

Cited by: 9
Authors
Jung, Yoonsuh [1]
Zhang, Hong [2]
Hu, Jianhua [3]
Affiliations
[1] Korea Univ, Dept Stat, Seoul, South Korea
[2] Fudan Univ, Inst Biostat, Shanghai, Peoples R China
[3] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
Funding
National Institutes of Health (USA); National Research Foundation, Singapore
Keywords
ANOVA; BIC; diverging number of parameters; high-dimensional variables; low rank; variable selection; CERVICAL-CANCER SUSCEPTIBILITY; REGRESSION; LASSO; GENE; CLASSIFICATION; POLYMORPHISM; ASSOCIATION; XRCC1; REGULARIZATION; LIKELIHOOD;
DOI
10.1177/0962280217753726
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Subject classification code
Abstract
High-dimensional data are often encountered in biomedical, environmental, and other studies. For example, in biomedical studies that involve high-throughput omic data, an important problem is to search for genetic variables that are predictive of a particular phenotype. A conventional solution is to characterize such relationships through regression models in which a phenotype is treated as the response variable and the variables are treated as covariates; this approach becomes particularly challenging when the number of variables exceeds the number of samples. We propose a general framework for expressing the transformed mean of high-dimensional variables in an exponential distribution family via ANOVA models in which a low-rank interaction space captures the association between the phenotype and the variables. This alternative method transforms the variable selection problem into a well-posed problem with the number of observations larger than the number of variables. In addition, we propose a model selection criterion for the new model framework with a diverging number of parameters, and establish the consistency of the selection criterion. We demonstrate the appealing performance of the proposed method in terms of prediction and detection accuracy through simulations and real data analyses.
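The idea of reading phenotype-variable association off a low-rank interaction term can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes a Gaussian response with an identity transform, balanced phenotype groups, and a plain truncated SVD of the double-centered group-by-variable mean table. The function name `low_rank_interaction_scores`, the simulated data, and the rank choice are all hypothetical.

```python
import numpy as np

def low_rank_interaction_scores(X, groups, rank=1):
    """Score each variable by its column norm in a rank-`rank` interaction fit.

    Hypothetical helper: X is an (n samples x p variables) array and
    `groups` holds one phenotype label per sample (balanced groups assumed).
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    # Group-by-variable table of cell means (G x p).
    cell_means = np.vstack([X[groups == g].mean(axis=0) for g in labels])
    # Double-centering removes the grand mean and both main effects,
    # leaving the estimated group-by-variable interaction.
    inter = (cell_means
             - cell_means.mean(axis=0, keepdims=True)
             - cell_means.mean(axis=1, keepdims=True)
             + cell_means.mean())
    # Truncated SVD gives the low-rank approximation of the interaction.
    U, s, Vt = np.linalg.svd(inter, full_matrices=False)
    low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return np.linalg.norm(low_rank, axis=0)

# Simulated example: 3 phenotype groups, 60 samples, 200 variables (p > n).
rng = np.random.default_rng(0)
n_per_group, p = 20, 200
groups = np.repeat([0, 1, 2], n_per_group)
X = rng.normal(size=(groups.size, p))
# Only the first 5 variables depend on the phenotype (assumed signal).
X[groups == 1, :5] += 2.0
X[groups == 2, :5] -= 2.0

scores = low_rank_interaction_scores(X, groups, rank=1)
top5 = np.argsort(scores)[::-1][:5]
print(sorted(top5.tolist()))
```

In this view every entry of `X` is treated as one observation, so the 60 x 200 table supplies 12,000 observations while the rank-1 interaction adds only a small number of parameters, which is one way to read the abstract's claim that the problem becomes well-posed. The paper's model selection criterion for choosing among candidate models is not reproduced here.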
Pages: 1230-1246
Number of pages: 17
Related papers
50 in total
  • [41] High-dimensional variable selection via low-dimensional adaptive learning
    Staerk, Christian
    Kateri, Maria
    Ntzoufras, Ioannis
    ELECTRONIC JOURNAL OF STATISTICS, 2021, 15 (01): : 830 - 879
  • [42] A Compact High-Dimensional Yield Analysis Method using Low-Rank Tensor Approximation
    Shi, Xiao
    Yan, Hao
    Huang, Qiancun
    Xuan, Chengzhen
    He, Lei
    Shi, Longxing
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2022, 27 (02)
  • [43] Fast Low-rank Metric Learning for Large-scale and High-dimensional Data
    Liu, Han
    Han, Zhizhong
    Liu, Yu-Shen
    Gu, Ming
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [44] A low-rank complexity reduction algorithm for the high-dimensional kinetic chemical master equation
    Einkemmer, Lukas
    Mangott, Julian
    Prugger, Martina
    JOURNAL OF COMPUTATIONAL PHYSICS, 2024, 503
  • [45] Accelerated High-Dimensional MR Imaging With Sparse Sampling Using Low-Rank Tensors
    He, Jingfei
    Liu, Qiegen
    Christodoulou, Anthony G.
    Ma, Chao
    Lam, Fan
    Liang, Zhi-Pei
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2016, 35 (09) : 2119 - 2129
  • [46] Tensor Convolution-Like Low-Rank Dictionary for High-Dimensional Image Representation
    Xue, Jize
    Zhao, Yong-Qiang
    Wu, Tongle
    Chan, Jonathan Cheung-Wai
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 13257 - 13270
  • [47] Variable Selection in High-Dimensional Partially Linear Models with Longitudinal Data
    Yang Yiping
    Xue Liugen
    RECENT ADVANCE IN STATISTICS APPLICATION AND RELATED AREAS, VOLS I AND II, 2009, : 661 - 667
  • [48] VARIABLE SELECTION FOR HIGH-DIMENSIONAL GENERALIZED VARYING-COEFFICIENT MODELS
    Lian, Heng
    STATISTICA SINICA, 2012, 22 (04) : 1563 - 1588
  • [49] Variable selection in high-dimensional sparse multiresponse linear regression models
    Luo, Shan
    STATISTICAL PAPERS, 2020, 61 (03) : 1245 - 1267
  • [50] A consistent variable selection criterion for linear models with high-dimensional covariates
    Zheng, XD
    Loh, WY
    STATISTICA SINICA, 1997, 7 (02) : 311 - 325