Genome-wide association studies of ischemic stroke based on interpretable machine learning

Citations: 0
Authors
Nikolić, Stefan [1]
Ignatov, Dmitry I. [1]
Khvorykh, Gennady V. [2]
Limborska, Svetlana A. [2]
Khrunin, Andrey V. [2]
Affiliations
[1] HSE Univ, Lab Models & Methods Computat Pragmat, Dept Data Anal & Artificial Intelligence, Moscow, Russia
[2] Natl Res Ctr Kurchatov Inst, Moscow, Russia
Funding
Russian Science Foundation
Keywords
Genome-wide association studies; Interpretable machine learning; Ischemic stroke; Illuminating druggable genome; XGBoost; Interpretable neural network TabNet; SNP ranking; SNP importance; OXIDATIVE STRESS; DISEASE; RISK; GENE; PROTEINS; LOCI;
DOI
10.7717/peerj-cs.2454
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Despite the identification of several dozen genetic loci associated with ischemic stroke (IS), the genetic basis of this disease remains largely unexplored. In this research, we present the results of genome-wide association studies (GWAS) based on classical statistical testing and machine learning algorithms (logistic regression, gradient boosting on decision trees, and the tabular deep learning model TabNet). To build a consensus among the results obtained by the different techniques, a Pareto-optimal solution was proposed and applied. These methods were applied to real genotypic data of affected and healthy individuals of European ancestry obtained from the Database of Genotypes and Phenotypes (5,581 individuals, 883,749 single nucleotide polymorphisms). In total, 131 genes were identified as candidates for association with the onset of IS. UBQLN1, TRPS1, and MUSK were previously described as associated with the course of IS in model animals. ACOT11, which takes part in the metabolism of fatty acids, was shown for the first time to be associated with IS. The identified genes were compared with genes from the Illuminating the Druggable Genome project. The product of GPR26, a G protein-coupled receptor, can be considered as a therapeutic target for stroke prevention. The approaches presented in this research can be used to reprocess GWAS datasets from other diseases.
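The abstract names a Pareto-optimal consensus across methods but does not say how it is computed. The following is only a generic sketch of Pareto-front selection over per-SNP importance scores, not the authors' procedure. The function name `pareto_front`, the SNP identifiers, and all score values are hypothetical, and every score is assumed to be oriented so that higher means more important.

```python
from typing import Dict, List, Sequence


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the SNP ids whose score vectors are not dominated by any other.

    A vector a dominates b if a >= b in every coordinate and a > b in at
    least one coordinate (higher is assumed to be better everywhere).
    """

    def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
        pairs = list(zip(a, b))
        return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

    return [
        snp
        for snp, vec in scores.items()
        if not any(
            dominates(other_vec, vec)
            for other, other_vec in scores.items()
            if other != snp
        )
    ]


# Hypothetical scores per SNP, one coordinate per method, for example
# (-log10 p-value, gradient-boosting importance, neural-network importance).
toy_scores = {
    "rs_a": (7.2, 0.10, 0.30),
    "rs_b": (5.0, 0.40, 0.20),
    "rs_c": (4.0, 0.05, 0.10),  # worse than rs_a on every coordinate
    "rs_d": (6.0, 0.35, 0.50),
}

front = pareto_front(toy_scores)
print(front)
```

In this toy input, `rs_c` is dropped because `rs_a` is at least as good on every score, while the remaining SNPs each win on at least one method and so stay on the front.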
Pages: 26
Related papers
50 in total
  • [22] Family-based genome-wide association studies
    Benyamin, B.
    Visscher, P. M.
    McRae, A. F.
    PHARMACOGENOMICS, 2009, 10 (02) : 181 - 190
  • [23] Learning Hierarchical Bayesian Networks for Genome-Wide Association Studies
    Mourad, Raphael
    Sinoquet, Christine
    Leray, Philippe
    COMPSTAT'2010: 19TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL STATISTICS, 2010, : 549 - 556
  • [24] GENOME-WIDE ASSOCIATION STUDIES Validating, augmenting and refining genome-wide association signals
    Ioannidis, John P. A.
    Thomas, Gilles
    Daly, Mark J.
    NATURE REVIEWS GENETICS, 2009, 10 (05) : 318 - 329
  • [25] Pulmonary Function: From Genome-Wide Association Studies to Genome-Wide Interaction Studies
    Christiani, David C.
    AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, 2019, 199 (05) : 557 - 559
  • [26] Genome-Wide Association Studies of Incident Total Stroke and Ischemic Stroke: Meta-Analysis and Replication from the CHARGE Consortium
    Ikram, Mohammad
    Aulchenko, Yurii
    van den Herik, Evita
    Bos, Michiel J.
    Struchalin, Maksim
    Rivadeneira, Fernando
    Hofman, Albert
    Koudstaal, Peter J.
    de Lau, Lonneke
    Oostra, Ben
    Uitterlinden, Andre
    Van Duijn, Cornelia
    Seshadri, Sudha
    DeStefano, Anita
    Beiser, Alexa
    Du, Yangchun
    Kelly-Hayes, Margaret
    Yang, Qiong
    Kase, Carlos S.
    Wolf, Philip A.
    Bis, Joshua
    Lumley, Thomas
    Glazer, Nicole
    Heckbert, Susan
    Smith, Nicholas
    Rice, Kenneth
    Psaty, Bruce
    Longstreth, W. T., Jr.
    Fornage, Myriam
    Boerwinkle, Eric
    Debette, Stephanie
    Folsom, Aaron
    Cushman, Mary
    Launer, Lenore J.
    Shahar, Eyal
    Rosamond, Wayne
    Lopez, Oscar L.
    Coresh, Josef
    DeCarli, Charles S.
    Haritunians, Talin
    Taylor, Kent
    Rotter, Jerome
    Roks, Gerwin
    de Kort, Paul
    Mosley, Thomas H.
    Breteler, Monique M. B.
    NEUROLOGY, 2009, 73 (04) : 330 - 331
  • [27] Identification of Shared Genes Between Ischemic Stroke and Parkinson's Disease Using Genome-Wide Association Studies
    Lang, Wenjing
    Wang, Junjie
    Ma, Xiaofeng
    Zhang, Nong
    Li, He
    Cui, Pan
    Hao, Junwei
    FRONTIERS IN NEUROLOGY, 2019, 10
  • [28] Genome-Wide Association Studies-Based Machine Learning for Prediction of Age-Related Macular Degeneration Risk
    Yan, Qi
    Jiang, Yale
    Huang, Heng
    Swaroop, Anand
    Chew, Emily Y.
    Weeks, Daniel E.
    Chen, Wei
    Ding, Ying
    TRANSLATIONAL VISION SCIENCE & TECHNOLOGY, 2021, 10 (02) : 1 - 8
  • [29] Machine-Learning-Based Genome-Wide Association Studies for Uncovering QTL Underlying Soybean Yield and Its Components
    Yoosefzadeh-Najafabadi, Mohsen
    Eskandari, Milad
    Torabi, Sepideh
    Torkamaneh, Davoud
    Tulpan, Dan
    Rajcan, Istvan
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2022, 23 (10)
  • [30] Genome-Wide Association Studies of Autism
    Glessner J.T.
    Connolly J.J.
    Hakonarson H.
    Current Behavioral Neuroscience Reports, 2014, 1 (4) : 234 - 241