An Efficient Nonlinear Regression Approach for Genome-wide Detection of Marginal and Interacting Genetic Variations

Cited by: 2
Authors
Lee, Seunghak [1]
Lozano, Aurelie [2]
Kambadur, Prabhanjan [3]
Xing, Eric P. [1]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, 5000 Forbes Ave, Pittsburgh, PA 15217 USA
[2] IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY USA
[3] Bloomberg LP, New York, NY USA
Keywords
genome-wide association mapping; SNP-SNP interaction; piecewise linear model screening; stability selection; group lasso; ALZHEIMERS-DISEASE; LATE-ONSET; ASSOCIATION; LASSO; DOPAMINE;
DOI
10.1089/cmb.2015.0202
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Genome-wide association studies have revealed individual genetic variants associated with phenotypic traits such as disease risk and gene expression. However, detecting pairwise interaction effects of genetic variants on traits remains a challenge because of the large number of variant combinations (approximately 10^11 SNP pairs in the human genome) and relatively small sample sizes (typically <10^4). Despite recent breakthroughs in detecting interaction effects, several problems remain open, including: (1) how to quickly process a large number of SNP pairs, (2) how to distinguish between true signals and SNPs/SNP pairs merely correlated with true signals, (3) how to detect nonlinear associations between SNP pairs and traits given small sample sizes, and (4) how to control false positives. In this article, we present a unified framework, called SPHINX, which addresses these challenges. We first propose a piecewise linear model for interaction detection, because it is simple enough for its parameters to be estimated from small samples yet complex enough to capture nonlinear interaction effects. Then, based on the piecewise linear model, we introduce randomized group lasso under stability selection, together with a screening algorithm, to address the statistical and computational challenges mentioned above. In our experiments, we first demonstrate that SPHINX achieves better power than existing methods for interaction detection under false positive control. We then apply SPHINX to a late-onset Alzheimer's disease dataset and report 16 SNPs and 17 SNP pairs associated with gene traits. We also present a highly scalable implementation of our screening algorithm, which can screen approximately 118 billion candidate associations on a 60-node cluster in <5.5 hours.
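The two scale figures in the abstract can be sanity-checked with a little arithmetic. The sketch below is only illustrative: the SNP count of 500,000 is an assumption (the abstract does not state the number of SNPs), and the function names are hypothetical. It counts unordered SNP pairs and converts the reported screening run (118 billion candidates, 60 nodes, 5.5 hours) into a throughput. Because the abstract reports "<5.5 hours", the computed rates are lower bounds.

```python
from math import comb


def snp_pair_count(num_snps):
    """Number of unordered SNP pairs among num_snps variants: n * (n - 1) / 2."""
    return comb(num_snps, 2)


def screening_rate(candidates, hours, nodes):
    """Return (cluster-wide, per-node) candidates screened per second."""
    seconds = hours * 3600
    total = candidates / seconds
    return total, total / nodes


# Assumption: roughly 500,000 genotyped SNPs (not a figure from the abstract).
pairs = snp_pair_count(500_000)

# Figures taken from the abstract: 118 billion candidates, 5.5 hours, 60 nodes.
total_rate, per_node_rate = screening_rate(118e9, 5.5, 60)

print(f"SNP pairs: {pairs:.3e}")
print(f"Cluster-wide rate: {total_rate:.3e} candidates/s")
print(f"Per-node rate: {per_node_rate:.3e} candidates/s")
```

Under the assumed SNP count, the pair count is on the order of 10^11, consistent with the abstract's estimate, which is why an exhaustive pairwise test needs a fast screening step.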
Pages: 372-389
Page count: 18
Related papers
50 in total
  • [1] An Efficient Nonlinear Regression Approach for Genome-Wide Detection of Marginal and Interacting Genetic Variations
    Lee, Seunghak
    Lozano, Aurelie
    Kambadur, Prabhanjan
    Xing, Eric P.
    RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY (RECOMB 2015), 2015, 9029 : 167 - 187
  • [2] Detection and application of genome-wide variations in peach for association and genetic relationship analysis
    Liping Guan
    Ke Cao
    Yong Li
    Jian Guo
    Qiang Xu
    Lirong Wang
    BMC Genetics, 20
  • [3] Detection and application of genome-wide variations in peach for association and genetic relationship analysis
    Guan, Liping
    Cao, Ke
    Li, Yong
    Guo, Jian
    Xu, Qiang
    Wang, Lirong
    BMC GENETICS, 2019, 20 (01)
  • [4] REMI: REGRESSION WITH MARGINAL INFORMATION AND ITS APPLICATION IN GENOME-WIDE ASSOCIATION STUDIES
    Huang, Jian
    Jiao, Yuling
    Liu, Jin
    Yang, Can
    STATISTICA SINICA, 2021, 31 (04) : 1985 - 2004
  • [5] A fast and efficient smoothing approach to LASSO regression and an application to a genome-wide association study for COPD
    Hahn, Georg
    Lutz, Sharon M.
    Laha, Nilanjana
    Lange, Christoph
    GENETIC EPIDEMIOLOGY, 2020, 44 (05) : 487 - 487
  • [6] Genome-wide mutation detection by interclonal genetic variation
    Revollo, Javier R.
    Dad, Azra
    McDaniel, Lea P.
    Pearce, Mason G.
    Dobrovolsky, Vasily N.
    MUTATION RESEARCH-GENETIC TOXICOLOGY AND ENVIRONMENTAL MUTAGENESIS, 2018, 829 : 61 - 69
  • [7] Genetic Variations and Health-Related Quality of Life (HRQOL): A Genome-Wide Study Approach
    Adjei, Araba A.
    Lopez, Camden L.
    Schaid, Daniel J.
    Sloan, Jeff A.
    Le-Rademacher, Jennifer G.
    Loprinzi, Charles L.
    Norman, Aaron D.
    Olson, Janet E.
    Couch, Fergus J.
    Beutler, Andreas S.
    Vachon, Celine M.
    Ruddy, Kathryn J.
    CANCERS, 2021, 13 (04) : 1 - 15
  • [8] Genome-wide detection of copy number variations in Tharparkar cattle
    Kumar, Harshit
    Panigrahi, Manjit
    Saravanan, K. A.
    Rajawat, Divya
    Parida, Subhashree
    Bhushan, Bharat
    Gaur, G. K.
    Dutt, Triveni
    Mishra, B. P.
    Singh, R. K.
    ANIMAL BIOTECHNOLOGY, 2023, 34 (02) : 448 - 455
  • [9] The genetic lesions associated with regression: A genome-wide search for destructive mutations in the cavefish genome
    Gross, J. B.
    Berning, D.
    Adams, H.
    Gross, Joshua
    INTEGRATIVE AND COMPARATIVE BIOLOGY, 2018, 58 : E83 - E83
  • [10] Genome-wide investigation on the genetic variations of rice disease resistance genes
    Sihai Yang
    Zhumei Feng
    Xiuyan Zhang
    Ke Jiang
    Xinqing Jin
    Yueyu Hang
    Jian-Qun Chen
    Dacheng Tian
    Plant Molecular Biology, 2006, 62 : 181 - 193