Support vector machine model of developmental brain gene expression data for prioritization of Autism risk gene candidates

Cited by: 49
Authors
Cogill, S. [1]
Wang, L. [1]
Affiliations
[1] Clemson Univ, Dept Biochem & Genet, Clemson, SC 29634 USA
Keywords
LONG NONCODING RNAS; SPECTRUM DISORDERS; PREDICTION; KNOWLEDGEBASE; IMPLICATE; EVOLUTION; CHILDREN; INSIGHTS; GENCODE; DNA;
DOI
10.1093/bioinformatics/btw498
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders with clinical heterogeneity and a substantial polygenic component. High-throughput methods for ASD risk gene identification produce numerous candidate genes that are time-consuming and expensive to validate. Prioritization methods can identify high-confidence candidates. Previous ASD gene prioritization methods have focused on a priori knowledge, which excludes genes with little functional annotation or no protein product such as long non-coding RNAs (lncRNAs).
Results: We have developed a support vector machine (SVM) model, trained using brain developmental gene expression data, for the classification and prioritization of ASD risk genes. The selected feature model had a mean accuracy of 76.7%, mean specificity of 77.2% and mean sensitivity of 74.4%. Gene lists comprised of an ASD risk gene and adjacent genes were ranked using the model's decision function output. The known ASD risk genes were ranked on average in the 77.4th, 78.4th and 80.7th percentile for sets of 101, 201 and 401 genes respectively. Of 10,840 lncRNA genes, 63 were classified as ASD-associated candidates with a confidence greater than 0.95. Genes previously associated with brain development and neurodevelopmental disorders were prioritized highly within the lncRNA gene list.
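Illustrative note (not part of the published abstract): the ranking step described above, in which a known risk gene is placed within a list of adjacent genes by decision-function output, can be sketched roughly as follows. This is a minimal sketch under stated assumptions. The gene names and scores are hypothetical, the helper percentile_rank is introduced here for illustration only, and the percentile is assumed to be the share of the other genes in the list that score below the target; the paper may define it differently.

```python
from typing import Mapping


def percentile_rank(scores: Mapping[str, float], target: str) -> float:
    """Return the percentage of the other genes in the list that score below `target`.

    Assumption: this is one plausible percentile definition, not necessarily
    the one used in the paper.
    """
    target_score = scores[target]
    others = [s for gene, s in scores.items() if gene != target]
    if not others:
        raise ValueError("need at least one other gene to rank against")
    below = sum(1 for s in others if s < target_score)
    return 100.0 * below / len(others)


# Hypothetical decision-function outputs for a risk gene and four adjacent genes.
scores = {
    "GENE_A": -1.2,
    "GENE_B": 0.3,
    "RISK_GENE": 0.9,
    "GENE_C": 1.4,
    "GENE_D": -0.1,
}

print(percentile_rank(scores, "RISK_GENE"))  # 75.0
```

In practice the scores would come from a trained classifier rather than being written by hand, and the list would hold 101, 201 or 401 genes as in the abstract.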
Pages: 3611-3618
Number of pages: 8