Support vector machine model of developmental brain gene expression data for prioritization of Autism risk gene candidates

Cited by: 49
Authors
Cogill, S. [1]
Wang, L. [1]
Affiliations
[1] Clemson Univ, Dept Biochem & Genet, Clemson, SC 29634 USA
Keywords
LONG NONCODING RNAS; SPECTRUM DISORDERS; PREDICTION; KNOWLEDGEBASE; IMPLICATE; EVOLUTION; CHILDREN; INSIGHTS; GENCODE; DNA;
DOI
10.1093/bioinformatics/btw498
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders with clinical heterogeneity and a substantial polygenic component. High-throughput methods for ASD risk gene identification produce numerous candidate genes that are time-consuming and expensive to validate. Prioritization methods can identify high-confidence candidates. Previous ASD gene prioritization methods have focused on a priori knowledge, which excludes genes with little functional annotation or no protein product, such as long non-coding RNAs (lncRNAs).
Results: We have developed a support vector machine (SVM) model, trained using brain developmental gene expression data, for the classification and prioritization of ASD risk genes. The selected feature model had a mean accuracy of 76.7%, mean specificity of 77.2% and mean sensitivity of 74.4%. Gene lists composed of an ASD risk gene and its adjacent genes were ranked using the model's decision function output. The known ASD risk genes were ranked on average in the 77.4th, 78.4th and 80.7th percentile for sets of 101, 201 and 401 genes, respectively. Of 10,840 lncRNA genes, 63 were classified as ASD-associated candidates with a confidence greater than 0.95. Genes previously associated with brain development and neurodevelopmental disorders were prioritized highly within the lncRNA gene list.
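The abstract reports two kinds of numbers: classification metrics (accuracy, specificity, sensitivity) and the percentile at which a known risk gene lands when a gene list is ranked by decision function output. The sketch below shows one plausible way to compute both from already-available labels and scores. It is an illustration only, not the authors' code: the function names, the gene names, and the scores are hypothetical, and the percentile definition used here (share of the other genes in the list that score strictly lower) is an assumption, since the abstract does not state the exact formula.

```python
from typing import Dict, Sequence


def percentile_rank(scores: Dict[str, float], target: str) -> float:
    """Percent of the *other* genes in the list scoring strictly below the target.

    Assumed definition; the abstract does not specify how percentiles were computed.
    """
    others = [s for gene, s in scores.items() if gene != target]
    if not others:
        raise ValueError("need at least one gene besides the target")
    below = sum(1 for s in others if s < scores[target])
    return 100.0 * below / len(others)


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels (1 = risk gene)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical decision scores for one risk gene and four neighbouring genes.
    scores = {"RISK_GENE": 1.2, "NEIGHBOUR_A": -0.5, "NEIGHBOUR_B": 0.3,
              "NEIGHBOUR_C": 1.9, "NEIGHBOUR_D": -1.1}
    print(percentile_rank(scores, "RISK_GENE"))
    print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this toy list the risk gene outscores three of its four neighbours. With real data, the scores would come from the trained classifier's decision function applied to each gene's expression features.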
Pages: 3611-3618
Page count: 8