Reaching the End-Game for GWAS: Machine Learning Approaches for the Prioritization of Complex Disease Loci

Cited by: 71
Authors
Nicholls, Hannah L. [1, 2]
John, Christopher R. [2, 3]
Watson, David S. [2, 4]
Munroe, Patricia B. [1, 5]
Barnes, Michael R. [1, 2, 5, 6]
Cabrera, Claudia P. [1, 2, 5]
Affiliations
[1] Queen Mary Univ London, Barts & London Sch Med & Dent, William Harvey Res Inst, Clin Pharmacol, London, England
[2] Queen Mary Univ London, Barts & London Sch Med & Dent, William Harvey Res Inst, Ctr Translat Bioinformat, London, England
[3] Queen Mary Univ London, Barts & London Sch Med & Dent, William Harvey Res Inst, Ctr Expt Med & Rheumatol, London, England
[4] Univ Oxford, Oxford Internet Inst, Oxford, England
[5] Queen Mary Univ London, Barts & London Sch Med & Dent, NIHR Barts Biomed Res Ctr, London, England
[6] Alan Turing Inst, British Lib, London, England
Keywords
machine learning; artificial intelligence; genome-wide association study; genomics; candidate gene; clinical translation; deep learning; data science; GENOME-WIDE ASSOCIATION; VARIABLE SELECTION; GENE; RISK; SCHIZOPHRENIA; IMPACT
DOI
10.3389/fgene.2020.00350
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Genome-wide association studies (GWAS) have revealed thousands of genetic loci that underpin the complex biology of many human traits. However, the strength of GWAS - the ability to detect genetic association by linkage disequilibrium (LD) - is also its limitation. Whilst the ever-increasing study size and improved design have augmented the power of GWAS to detect effects, differentiation of causal variants or genes from other highly correlated genes associated by LD remains the real challenge. This has severely hindered the biological insights and clinical translation of GWAS findings. Although thousands of disease susceptibility loci have been reported, causal genes at these loci remain elusive. Machine learning (ML) techniques offer an opportunity to dissect the heterogeneity of variant and gene signals in the post-GWAS analysis phase. ML models for GWAS prioritization vary greatly in their complexity, ranging from relatively simple logistic regression approaches to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models, i.e., neural networks. Paired with functional validation, these methods show important promise for clinical translation, providing a strong evidence-based approach to direct post-GWAS research. However, as ML approaches continue to evolve to meet the challenge of causal gene identification, a critical assessment of the underlying methodologies and their applicability to the GWAS prioritization problem is needed. This review investigates the landscape of ML applications in three parts: selected models, input features, and output model performance, with a focus on prioritizations of complex disease associated loci. Overall, we explore the contributions ML has made towards reaching the GWAS end-game with consequent wide-ranging translational impact.
Pages: 15