Comprehensive-GWAS: a pipeline for genome-wide association studies utilizing cross-validation to assess the predictivity of genetic variations

Cited by: 1
Authors
Dagasso, Gabrielle [1 ]
Yan, Yan [2 ]
Wang, Lipu [3 ]
Li, Longhai [4 ]
Kutcher, Randy [3 ]
Zhang, Wentao [5 ]
Jin, Lingling [6 ]
Affiliations
[1] Thompson Rivers Univ, Dept Math & Stat, Kamloops, BC, Canada
[2] Thompson Rivers Univ, Dept Comp Sci, Kamloops, BC, Canada
[3] Univ Saskatchewan, Dept Plant Sci, Saskatoon, SK, Canada
[4] Univ Saskatchewan, Dept Math & Stat, Saskatoon, SK, Canada
[5] Natl Res Council Canada, Ottawa, ON, Canada
[6] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
Keywords
SOFTWARE; MODELS;
DOI
10.1109/BIBM49941.2020.9313355
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Genome-wide association studies (GWAS) are an important approach for associating genetic variations among individuals with a particular trait. Although many GWAS programs have been developed based on different statistical models, their results can vary to a large extent. To obtain a more comprehensive and accurate set of SNPs associated with a trait, we present comprehensive-GWAS, a novel automated pipeline that uses a two-step wrapper model to connect the programs involved in traditional GWAS analyses with machine learning methods, with additional population structure analysis. It first performs population structure analysis, then executes multiple GWAS programs and combines their results into a single SNP subset. After that, it selects relevant SNPs with high individual and/or joint effects from that SNP subset and assesses the predictivity of the model using cross-validation with LASSO. The combined and validated "true" significant SNPs are output as a Manhattan plot, a QQ plot, and statistical results for each trait. To demonstrate the utility of the comprehensive-GWAS pipeline, it was applied to 199 wheat varieties that were genotyped with the 90K Infinium SNP array and phenotyped for traits related to Fusarium head blight (FHB) disease under greenhouse conditions in 2019 with three replications. It pinpoints genome regions that are more likely to be responsible for FHB resistance. The results will contribute to characterizing the genetic architecture of wheat lines with the highest FHB resistance. The pipeline is publicly available at https://github.com/notTrivial/Comprehensive-GWAS.
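The two-step idea described in the abstract (merge the hits from several GWAS programs, then let LASSO with cross-validation judge how predictive the merged SNP subset is) might be sketched roughly as follows. This is an illustrative sketch only, not the pipeline's actual code: the function names, the p-value threshold `alpha`, the tool labels, and the pure-Python coordinate-descent LASSO are all assumptions made for this example.

```python
import random

def combine_hits(results, alpha):
    """Union of SNP ids with p < alpha in at least one tool (hypothetical rule)."""
    return sorted({snp for hits in results.values()
                   for snp, p in hits.items() if p < alpha})

def soft_threshold(z, g):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Centre each SNP column so the intercept can be recovered afterwards.
    cols = [[X[i][j] - means[j] for i in range(n)] for j in range(p)]
    resid = [v - y_mean for v in y]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = cols[j]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:  # monomorphic SNP in this sample: skip it
                continue
            rho = sum(c * r for c, r in zip(col, resid)) / n + norm * beta[j]
            new = soft_threshold(rho, lam) / norm
            delta = new - beta[j]
            if delta:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, means))
    return intercept, beta

def cv_mse(X, y, lam, k=5, seed=0):
    """Mean squared prediction error of the LASSO fit under k-fold CV."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    total = 0.0
    for f in range(k):
        fold = idx[f::k]
        held = set(fold)
        train = [i for i in idx if i not in held]
        b0, beta = lasso_fit([X[i] for i in train], [y[i] for i in train], lam)
        for i in fold:
            pred = b0 + sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / len(X)

# Toy usage: two hypothetical tools, genotypes coded 0/1/2 for two SNPs.
tool_results = {"tool_a": {"snp1": 1e-6, "snp2": 0.2},
                "tool_b": {"snp2": 1e-5, "snp3": 0.5}}
subset = combine_hits(tool_results, alpha=1e-4)
genotypes = [[0, 1], [1, 0], [2, 1], [0, 2], [1, 1], [2, 0], [0, 0], [2, 2]]
phenotype = [2 * g[0] + 1 for g in genotypes]  # trait driven by the first SNP
best_lam = min([0.01, 0.1, 1.0], key=lambda l: cv_mse(genotypes, phenotype, l, k=4))
intercept, effects = lasso_fit(genotypes, phenotype, best_lam)
```

In a sketch like this, SNPs whose fitted effect stays at zero are dropped, and the cross-validated error indicates how well the retained SNPs predict the trait. A real analysis would more likely rely on an established LASSO implementation than on this hand-written solver.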
Pages: 1361-1367
Number of pages: 7
Related papers
50 items in total
  • [41] How genome-wide association studies (GWAS) made traditional candidate gene studies obsolete
    Duncan, Laramie E.
    Ostacher, Michael
    Ballon, Jacob
    [J]. NEUROPSYCHOPHARMACOLOGY, 2019, 44 (09) : 1518 - 1523
  • [42] Genome-Wide Association Studies and Comparison of Models and Cross-Validation Strategies for Genomic Prediction of Quality Traits in Advanced Winter Wheat Breeding Lines
    Kristensen, Peter S.
    Jahoor, Ahmed
    Andersen, Jeppe R.
    Cericola, Fabio
    Orabi, Jihad
    Janss, Luc L.
    Jensen, Just
    [J]. FRONTIERS IN PLANT SCIENCE, 2018, 9
  • [43] Comprehensive genetic diversity and genome-wide association studies revealed the genetic basis of avocado fruit quality traits
    Li, Jin
    Eltaher, Shamseldeen
    Freeman, Barbie
    Singh, Sukhwinder
    Ali, Gul Shad
    [J]. FRONTIERS IN PLANT SCIENCE, 2024, 15
  • [44] Integrative Genome-Wide Association Studies of eQTL and GWAS Data for Gout Disease Susceptibility
    Meng-tse Gabriel Lee
    Tzu-Chun Hsu
    Shyr-Chyr Chen
    Ya-Chin Lee
    Po-Hsiu Kuo
    Jenn-Hwai Yang
    Hsiu-Hao Chang
    Chien-Chang Lee
    [J]. Scientific Reports, 9
  • [45] Efficiency of genome-wide association studies in random cross populations
    José Marcelo Soriano Viana
    Gabriel Borges Mundim
    Hélcio Duarte Pereira
    Andréa Carla Bastos Andrade
    Fabyano Fonseca e Silva
    [J]. Molecular Breeding, 2017, 37
  • [46] Genome-wide association studies and the genetic dissection of complex traits
    Sebastiani, Paola
    Timofeev, Nadia
    Dworkis, Daniel A.
    Perls, Thomas T.
    Steinberg, Martin H.
    [J]. AMERICAN JOURNAL OF HEMATOLOGY, 2009, 84 (08) : 504 - 515
  • [47] Testing and genetic model selection in genome-wide association studies
    Loley, Christina
    Koenig, Inke R.
    Hothorn, Ludwig
    Ziegler, Andreas
    [J]. ANNALS OF HUMAN GENETICS, 2012, 76 : 420 - 420
  • [48] Testing and Genetic Model Selection in Genome-Wide Association Studies
    Loley, Christina
    Konig, Inke R.
    Hothorn, Ludwig
    Ziegler, Andreas
    [J]. GENETIC EPIDEMIOLOGY, 2012, 36 (02) : 149 - 149
  • [49] Genome-wide association studies (GWAS) of multiple disease resistance in spring barley.
    Gyawali, S.
    Amezrou, R.
    Chao, S.
    Bhardwaj, S. C.
    Brueggeman, R.
    Fernando, W. G. D.
    Verma, R. P. S.
    [J]. CANADIAN JOURNAL OF PLANT PATHOLOGY, 2017, 39 (04) : 558 - 559
  • [50] The power of genetic diversity in genome-wide association studies of lipids
    Sarah E. Graham
    Shoa L. Clarke
    Kuan-Han H. Wu
    Stavroula Kanoni
    Greg J. M. Zajac
    Shweta Ramdas
    Ida Surakka
    Ioanna Ntalla
    Sailaja Vedantam
    Thomas W. Winkler
    Adam E. Locke
    Eirini Marouli
    Mi Yeong Hwang
    Sohee Han
    Akira Narita
    Ananyo Choudhury
    Amy R. Bentley
    Kenneth Ekoru
    Anurag Verma
    Bhavi Trivedi
    Hilary C. Martin
    Karen A. Hunt
    Qin Hui
    Derek Klarin
    Xiang Zhu
    Gudmar Thorleifsson
    Anna Helgadottir
    Daniel F. Gudbjartsson
    Hilma Holm
    Isleifur Olafsson
    Masato Akiyama
    Saori Sakaue
    Chikashi Terao
    Masahiro Kanai
    Wei Zhou
    Ben M. Brumpton
    Humaira Rasheed
    Sanni E. Ruotsalainen
    Aki S. Havulinna
    Yogasudha Veturi
    QiPing Feng
    Elisabeth A. Rosenthal
    Todd Lingren
    Jennifer Allen Pacheco
    Sarah A. Pendergrass
    Jeffrey Haessler
    Franco Giulianini
    Yuki Bradford
    Jason E. Miller
    Archie Campbell
    [J]. Nature, 2021, 600 : 675 - 679