Reaching the End-Game for GWAS: Machine Learning Approaches for the Prioritization of Complex Disease Loci

Cited by: 71
Authors
Nicholls, Hannah L. [1, 2]
John, Christopher R. [2, 3]
Watson, David S. [2, 4]
Munroe, Patricia B. [1, 5]
Barnes, Michael R. [1, 2, 5, 6]
Cabrera, Claudia P. [1, 2, 5]
Affiliations
[1] Queen Mary Univ London, Barts & London Sch Med & Dent, William Harvey Res Inst, Clin Pharmacol, London, England
[2] Queen Mary Univ London, Barts & London Sch Med & Dent, William Harvey Res Inst, Ctr Translat Bioinformat, London, England
[3] Queen Mary Univ London, Barts & London Sch Med & Dent, William Harvey Res Inst, Ctr Expt Med & Rheumatol, London, England
[4] Univ Oxford, Oxford Internet Inst, Oxford, England
[5] Queen Mary Univ London, Barts & London Sch Med & Dent, NIHR Barts Biomed Res Ctr, London, England
[6] Alan Turing Inst, British Lib, London, England
Keywords
machine learning; artificial intelligence; genome-wide association study; genomics; candidate gene; clinical translation; deep learning; data science; GENOME-WIDE ASSOCIATION; VARIABLE SELECTION; GENE; RISK; SCHIZOPHRENIA; IMPACT
DOI
10.3389/fgene.2020.00350
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Genome-wide association studies (GWAS) have revealed thousands of genetic loci that underpin the complex biology of many human traits. However, the strength of GWAS - the ability to detect genetic association by linkage disequilibrium (LD) - is also its limitation. Whilst the ever-increasing study size and improved design have augmented the power of GWAS to detect effects, differentiation of causal variants or genes from other highly correlated genes associated by LD remains the real challenge. This has severely hindered the biological insights and clinical translation of GWAS findings. Although thousands of disease susceptibility loci have been reported, causal genes at these loci remain elusive. Machine learning (ML) techniques offer an opportunity to dissect the heterogeneity of variant and gene signals in the post-GWAS analysis phase. ML models for GWAS prioritization vary greatly in their complexity, ranging from relatively simple logistic regression approaches to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models, i.e., neural networks. Paired with functional validation, these methods show important promise for clinical translation, providing a strong evidence-based approach to direct post-GWAS research. However, as ML approaches continue to evolve to meet the challenge of causal gene identification, a critical assessment of the underlying methodologies and their applicability to the GWAS prioritization problem is needed. This review investigates the landscape of ML applications in three parts: selected models, input features, and output model performance, with a focus on prioritizations of complex disease associated loci. Overall, we explore the contributions ML has made towards reaching the GWAS end-game with consequent wide-ranging translational impact.
Pages: 15