A new disease-specific machine learning approach for the prediction of cancer-causing missense variants

Cited by: 62
Authors
Capriotti, Emidio [1, 3]
Altman, Russ B. [2]
Affiliations
[1] Stanford Univ, Dept Bioengn, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[3] Univ Balearic Isl, Dept Math & Comp Sci, Palma De Mallorca, Spain
Keywords
Single Nucleotide Polymorphisms; Cancer-causing variants; Gene Ontology; Machine-learning; Support Vector Machine; SINGLE-NUCLEOTIDE POLYMORPHISMS; NON-SYNONYMOUS SNPS; PROTEIN MUTATIONS; SOMATIC MUTATIONS; GENE ONTOLOGY; HUMAN BREAST; ANNOTATION; SEQUENCE; DATABASE; TOOL;
DOI
10.1016/j.ygeno.2011.06.010
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
High-throughput genotyping and sequencing techniques are rapidly and inexpensively providing large amounts of human genetic variation data. Single Nucleotide Polymorphisms (SNPs) are an important source of human genome variability and have been implicated in several human diseases, including cancer. Amino acid mutations resulting from non-synonymous SNPs in coding regions may generate protein functional changes that affect cell proliferation. In this study, we developed a machine learning approach to predict cancer-causing missense variants. We present a Support Vector Machine (SVM) classifier trained on a set of 3163 cancer-causing variants and an equal number of neutral polymorphisms. The method achieves 93% overall accuracy, a correlation coefficient of 0.86, and an area under the ROC curve of 0.98. When compared with previously developed algorithms such as SIFT and CHASM, our method achieves higher prediction accuracy and a higher correlation coefficient in identifying cancer-causing variants. (C) 2011 Elsevier Inc. All rights reserved.
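The three figures quoted in the abstract (overall accuracy, correlation coefficient, area under the ROC curve) can be illustrated with a minimal sketch. It assumes that "correlation coefficient" means the Matthews correlation coefficient, which the abstract does not state explicitly. The labels, scores, and 0.5 decision threshold below are invented toy values, not data from the paper, and the function names are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = cancer-causing, 0 = neutral)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy, balanced example (assumption: invented values, 4 positives and 4 negatives).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold of 0.5

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.3f}  MCC={mcc:.3f}  AUC={auc:.4f}")
```

On a balanced set with equal numbers of false positives and false negatives, the Matthews coefficient reduces to 2 x accuracy - 1. Under that reading, the reported values are mutually consistent: 2 x 0.93 - 1 = 0.86.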
Pages: 310-317
Number of pages: 8