VARIABLE SELECTION METHOD FOR THE IDENTIFICATION OF EPISTATIC MODELS

Citations: 0
Authors
Holzinger, Emily Rose [1 ]
Szymczak, Silke [1 ]
Dasgupta, Abhijit [2 ]
Malley, James [3 ]
Li, Qing [1 ]
Bailey-Wilson, Joan E. [1 ]
Affiliations
[1] NHGRI, Computat & Stat Genom Branch, NIH, Baltimore, MD 21224 USA
[2] NIAMS, Clin Trials & Outcomes Branch, NIH, Bethesda, MD 20892 USA
[3] NIH, Ctr Informat Technol, Bethesda, MD 20892 USA
Keywords
DOI
Not available
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Standard analysis methods for genome-wide association studies (GWAS) are not robust to complex disease models, such as interactions between variables with small main effects. These types of effects likely contribute to the heritability of complex human traits. Machine learning methods that are capable of identifying interactions, such as Random Forests (RF), are an alternative analysis approach. One caveat to RF is that there is no standardized method of selecting variables so that false positives are reduced while retaining adequate power. To this end, we have developed a novel variable selection method called relative recurrency variable importance metric (r2VIM). This method incorporates recurrency and variance estimation to assist in optimal threshold selection. For this study, we specifically address how this method performs in data with almost completely epistatic effects (i.e., no marginal effects). Our results show that with appropriate parameter settings, r2VIM can identify interaction effects when the marginal effects are virtually nonexistent. It also outperforms logistic regression, which has essentially no power under this type of model when the number of potential features (genetic variants) is large. (All Supplementary Data can be found here: http://research.nhgri.nih.gov/manuscripts/Bailey-Wilson/r2VIM_epi/).
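The abstract does not spell out the selection rule, so the following is only a minimal sketch of one plausible reading of "recurrency and variance estimation", not the authors' implementation. It assumes that each Random Forest run yields one importance score per variable, that the most negative score in a run serves as a rough estimate of the noise level, and that a variable is kept only if its scaled importance clears a threshold in every run. The function names, the `factor` parameter, and the importance values are all hypothetical.

```python
def relative_importance(run_importances):
    """Scale one run's importances by the magnitude of its most negative value.

    Assumption: the most negative importance approximates the spread of
    importances among variables with no true effect.
    """
    lowest = min(run_importances)
    if lowest >= 0:
        raise ValueError("need at least one negative importance to estimate noise")
    scale = abs(lowest)
    return [value / scale for value in run_importances]


def select_recurrent(importances_by_run, factor=1.0):
    """Return indices of variables whose relative importance is >= factor in every run."""
    relative = [relative_importance(run) for run in importances_by_run]
    n_vars = len(relative[0])
    return [j for j in range(n_vars) if all(run[j] >= factor for run in relative)]


# Hypothetical importances from three forest runs over four variables.
runs = [
    [0.50, 0.30, -0.10, 0.05],
    [0.40, 0.02, -0.08, 0.20],
    [0.60, 0.25, -0.12, -0.03],
]
selected = select_recurrent(runs, factor=3.0)
print(selected)
```

In this sketch, raising `factor` trades power for fewer false positives, which is the threshold choice the abstract refers to. Requiring a variable to pass in every run is the "recurrency" part of the assumption.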
Pages: 195-206
Page count: 12