Prioritization of risk genes for Alzheimer's disease: an analysis framework using spatial and temporal gene expression data in the human brain based on support vector machine

Cited by: 4
Authors
Wang, Shiyu [1]
Fang, Xixian [1]
Wen, Xiang [2]
Yang, Congying [1]
Yang, Ying [1]
Zhang, Tianxiao [1, 3]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Hlth Sci Ctr, Xian, Peoples R China
[2] Univ Chinese Acad Sci, Hangzhou Inst Adv Study, Beijing, Peoples R China
[3] Shaanxi Reg Ctr, Natl Antidrug Lab, Xian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Alzheimer's disease; risk gene prioritization; gene expression patterns; machine learning; genome-wide association analyses; REGULATORS;
DOI
10.3389/fgene.2023.1190863
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Background: Alzheimer's disease (AD) is a complex disorder, and its risk is influenced by multiple genetic and environmental factors. In this study, an AD risk gene prediction framework based on spatial and temporal features of gene expression data (STGE) was proposed.
Methods: We proposed an AD risk gene prediction framework based on spatial and temporal features of gene expression data. The gene expression data of providers of different tissues and ages were used as model features. Human genes were classified as AD risk or non-risk sets based on information extracted from relevant databases. Support vector machine (SVM) models were constructed to capture the expression patterns of genes believed to contribute to the risk of AD.
Results: The recursive feature elimination (RFE) method was utilized for feature selection. Data for 64 tissue-age features were obtained before feature selection, and this number was reduced to 19 after RFE was performed. The SVM models were built and evaluated using 19 selected and full features. The area under curve (AUC) values for the SVM model based on 19 selected features (0.740 [0.690-0.790]) and full feature sets (0.730 [0.678-0.769]) were very similar. Fifteen genes predicted to be risk genes for AD with a probability greater than 90% were obtained.
Conclusion: The newly proposed framework performed comparably to previous prediction methods based on protein-protein interaction (PPI) network properties. A list of 15 candidate genes for AD risk was also generated to provide data support for further studies on the genetic etiology of AD.
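The workflow summarized in the abstract (RFE reducing 64 features to 19, then comparing SVM AUC on the selected and full feature sets) can be illustrated with a minimal sketch. It is not the authors' code: the use of scikit-learn, the linear kernel, the 5-fold cross-validation, the RFE step size, and the synthetic data are all assumptions made for illustration, and the AUC values it produces are unrelated to those reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a gene-by-feature matrix (assumption): 64 columns
# mimic the 64 tissue-age features, labels mimic risk (1) vs non-risk (0).
X, y = make_classification(
    n_samples=400, n_features=64, n_informative=10, random_state=0
)


def make_model(n_selected=None):
    """Scaled linear SVM, optionally preceded by RFE down to n_selected features."""
    steps = [StandardScaler()]
    if n_selected is not None:
        # RFE needs an estimator exposing coef_, hence the linear kernel.
        steps.append(
            RFE(SVC(kernel="linear"), n_features_to_select=n_selected, step=5)
        )
    steps.append(SVC(kernel="linear"))
    return make_pipeline(*steps)


# Cross-validated AUC on all 64 features and on 19 RFE-selected features.
# RFE sits inside the pipeline, so selection is refit within each fold.
auc_full = cross_val_score(make_model(), X, y, cv=5, scoring="roc_auc").mean()
auc_rfe = cross_val_score(make_model(19), X, y, cv=5, scoring="roc_auc").mean()

# Fit once on all data to inspect which features RFE keeps.
selector = make_model(19).fit(X, y)
n_kept = int(selector.named_steps["rfe"].support_.sum())

print(f"AUC, full features: {auc_full:.3f}")
print(f"AUC, {n_kept} selected features: {auc_rfe:.3f}")
```

Placing RFE inside the pipeline keeps feature selection from seeing the held-out fold, which is one common way to avoid an optimistic AUC estimate. Whether the original study did this is not stated in the abstract.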
Pages: 8
Related papers
50 in total
  • [22] Bioinformatics analysis of diagnostic biomarkers for Alzheimer's disease in peripheral blood based on sex differences and support vector machine algorithm
    Ji, Wencan
    An, Ke
    Wang, Canjun
    Wang, Shaohua
    HEREDITAS, 2022, 159 (01)
  • [23] Early diagnosis of Alzheimer's disease based on partial least squares, principal component analysis and support vector machine using segmented MRI images
    Khedher, L.
    Ramirez, J.
    Gorriz, J. M.
    Brahim, A.
    Segovia, F.
    NEUROCOMPUTING, 2015, 151 : 139 - 150
  • [24] Definition of genes and paths involved in Alzheimer's disease: Using gene expression profiles and chemical genetics at the mouse brain level
    Wu, Pu
    Hu, Yinghe
    CURRENT GENOMICS, 2006, 7 (05) : 293 - 300
  • [25] Alzheimer's Patient Analysis Using Image and Gene Expression Data and Explainable-AI to Present Associated Genes
    Kamal, Md. Sarwar
    Northcote, Aden
    Chowdhury, Linkon
    Dey, Nilanjan
    Gonzalez Crespo, Ruben
    Herrera-Viedma, Enrique
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
  • [26] Random-Forest (RF) and Support Vector Machine (SVM) Implementation for Analysis of Gene Expression Data in Chronic Kidney Disease (CKD)
    Rustam, Zuherman
    Sudarsono, Ely
    Sarwinda, Devvi
    9TH ANNUAL BASIC SCIENCE INTERNATIONAL CONFERENCE 2019 (BASIC 2019), 2019, 546
  • [27] Integrating rare pathogenic variant prioritization with gene-based association analysis to identify novel genes and relevant multimodal traits for Alzheimer's disease
    Cao, Jixin
    Zhang, Cheng
    Lo, Chun-Yi Zac
    Guo, Qihao
    Ding, Jing
    Luo, Xiaohui
    Zhang, Zi-Chao
    Chen, Feng
    Cheng, Tian-Lin
    Chen, Jingqi
    Zhao, Xing-Ming
    ALZHEIMERS & DEMENTIA, 2025, 21 (02)
  • [28] Predictive Models Based on Support Vector Machines: Whole-Brain versus Regional Analysis of Structural MRI in the Alzheimer's Disease
    Retico, Alessandra
    Bosco, Paolo
    Cerello, Piergiorgio
    Fiorina, Elisa
    Chincarini, Andrea
    Fantacci, Maria Evelina
    JOURNAL OF NEUROIMAGING, 2015, 25 (04) : 552 - 563
  • [29] Predication of Parkinson's disease using Data Mining Methods: a comparative analysis of tree, statistical and support vector machine classifiers
    Yadav, Geeta
    Kumar, Yugal
    Sahoo, G.
    2012 NATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION SYSTEMS (NCCCS), 2012, : 21 - 28
  • [30] An Efficient WRF Framework for Discovering Risk Genes and Abnormal Brain Regions in Parkinson's Disease Based on Imaging Genetics Data
    Bi, Xia-An
    Xing, Zhao-Xu
    Xu, Rui-Hui
    Hu, Xi
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2021, 36 (02) : 361 - 374