Iterative sure independence screening EM-Bayesian LASSO algorithm for multi-locus genome-wide association studies

Cited by: 177
Authors
Tamba, Cox Lwaka [1, 2]
Ni, Yuan-Li [1]
Zhang, Yuan-Ming [1, 3]
Affiliations
[1] Nanjing Agr Univ, State Key Lab Crop Genet & Germplasm Enhancement, Nanjing, Jiangsu, Peoples R China
[2] Egerton Univ, Dept Math, Egerton, Kenya
[3] Huazhong Agr Univ, Stat Genom Lab, Coll Plant Sci & Technol, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
PENALIZED LOGISTIC-REGRESSION; VARIABLE SELECTION; ORACLE PROPERTIES;
DOI
10.1371/journal.pcbi.1005357
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
A genome-wide association study (GWAS) examines a large number of single nucleotide polymorphisms (SNPs) in a limited sample of hundreds of individuals, which poses a variable selection problem in a high-dimensional dataset. Although many single-locus GWAS approaches that control for polygenic background and population structure have been widely used, some significant loci fail to be detected. In this study, we used an iterative modified sure independence screening (ISIS) approach to reduce the number of SNPs to a moderate size. The Expectation-Maximization (EM)-Bayesian least absolute shrinkage and selection operator (BLASSO) was then used to estimate the effects of all selected SNPs for true quantitative trait nucleotide (QTN) detection. This method is referred to as the ISIS EM-BLASSO algorithm. Monte Carlo simulation studies validated the new method: compared with efficient mixed-model association (EMMA), smoothly clipped absolute deviation (SCAD), fixed and random model circulating probability unification (FarmCPU), and multi-locus random-SNP-effect mixed linear model (mrMLM), it had the highest empirical power in QTN detection, the highest accuracy in QTN effect estimation, and the shortest running time. To further demonstrate the new method, six flowering time traits in Arabidopsis thaliana were re-analyzed by four methods (the new method, EMMA, FarmCPU, and mrMLM). The new method identified most of the previously reported genes. Therefore, the new method is a good alternative for multi-locus GWAS.
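The two-stage idea in the abstract (screen the SNPs down to a moderate number, then estimate the effects of the retained SNPs with a shrinkage method) can be illustrated with a small sketch. This is not the authors' algorithm: it uses a single pass of marginal-correlation screening instead of the iterative modified ISIS, and an ordinary lasso fitted by coordinate descent in place of EM-BLASSO. The function names, the simulated genotypes, and the values of `keep` and `lam` are all assumptions made for illustration.

```python
import random

def center(v):
    """Subtract the mean from a list of numbers."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length lists."""
    xc, yc = center(x), center(y)
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    syy = sum(b * b for b in yc)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5

def screen(columns, y, keep):
    """Stage 1 (one pass only): keep the `keep` SNPs most correlated with y."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: abs_corr(columns[j], y), reverse=True)
    return sorted(ranked[:keep])

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(columns, y, lam, sweeps=200):
    """Stage 2 stand-in: ordinary lasso by coordinate descent,
    minimizing (1/2n)*||y - Xb||^2 + lam*||b||_1 on centered data."""
    n = len(y)
    cols = [center(c) for c in columns]
    resid = center(y)
    beta = [0.0] * len(cols)
    scale = [sum(v * v for v in c) / n for c in cols]
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            if scale[j] == 0.0:
                continue
            # Covariance of SNP j with the partial residual (SNP j's fit added back).
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + scale[j] * beta[j]
            new = soft_threshold(rho, lam) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [ri - delta * ci for ri, ci in zip(resid, c)]
                beta[j] = new
    return beta

def simulate(n=120, p=300, causal=(10, 150), effect=1.0, seed=1):
    """Hypothetical toy data: genotypes coded -1/0/1 and two causal SNPs."""
    rng = random.Random(seed)
    columns = [[float(rng.choice((-1, 0, 1))) for _ in range(n)] for _ in range(p)]
    y = [sum(effect * columns[j][i] for j in causal) + rng.gauss(0.0, 0.5)
         for i in range(n)]
    return columns, y

columns, y = simulate()
kept = screen(columns, y, keep=20)                       # 300 SNPs -> 20 SNPs
beta = lasso_cd([columns[j] for j in kept], y, lam=0.2)  # shrink small effects to zero
selected = [j for j, b in zip(kept, beta) if b != 0.0]
print("screened:", kept)
print("selected:", selected)
```

The sketch ignores polygenic background and population structure, which the methods compared in the abstract are designed to control, so it only shows the screen-then-shrink structure, not a usable GWAS procedure.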
Pages: 20