Pathogenicity Prediction of Single Amino Acid Variants With Machine Learning Model Based on Protein Structural Energies

Cited by: 1
Authors
Wu, Tzu-Hsuan [1]
Lin, Peng-Chan [2]
Chou, Hsin-Hung [3]
Shen, Meng-Ru [4]
Hsieh, Sun-Yuan [1, 5]
Affiliations
[1] Natl Cheng Kung Univ, Inst Med Informat, Tainan 701, Taiwan
[2] Natl Cheng Kung Univ Hosp, Dept Comp Sci & Informat Engn, Dept Internal Med, Tainan 704, Taiwan
[3] Natl Chi Nan Univ, Dept Comp Sci & Informat Engn, Puli Township 54516, Nantou County, Taiwan
[4] Natl Cheng Kung Univ, Dept Obstet & Gynecol, Dept Pharmacol, Coll Med, Tainan 701, Taiwan
[5] Natl Cheng Kung Univ, Inst Mfg Informat Syst, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan
Keywords
Machine learning; pathogenicity prediction; protein structure energy; single amino acid variants; SNP; MUTATIONS; POLYMORPHISMS
DOI
10.1109/TCBB.2021.3139048
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The most popular tools for predicting pathogenicity of single amino acid variants (SAVs) were developed based on sequence-based techniques. SAVs may change protein structure and function. In the context of van der Waals force and disulfide bridge calculations, no method directly predicts the impact of mutations on the energies of the protein structure. Here, we combined machine learning methods and energy scores of protein structures calculated by Rosetta Energy Function 2015 to predict SAV pathogenicity. The accuracy level of our model (0.76) is higher than that of six prediction tools. Further analyses revealed that the differential reference energies, attractive energies, and solvation of polar atoms between wild-type and mutant side-chains played essential roles in distinguishing benign from pathogenic variants. These features indicated the physicochemical properties of amino acids, which were observed in 3D structures instead of sequences. We added 16 features to Rhapsody (the prediction tool we used for our data set) and consequently improved its performance. The results indicated that these energy scores were more appropriate and more detailed representations of the pathogenicity of SAVs.
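The "differential" energies mentioned in the abstract can be pictured as per-term differences between the mutant and wild-type structures. The sketch below is only an illustration under stated assumptions: the term names are a hypothetical subset of Rosetta score-term labels, the mutant-minus-wild-type sign convention is assumed, the numbers are made up, and the helper `delta_energy_features` is not from the paper. The abstract does not list the 16 features or the classifier used.

```python
from typing import Dict, List

# Hypothetical subset of energy-term names (assumption: the abstract does not
# enumerate the terms or the 16 features that were actually used).
ENERGY_TERMS: List[str] = ["ref", "fa_atr", "fa_rep", "lk_ball_wtd", "dslf_fa13"]


def delta_energy_features(
    wild_type: Dict[str, float],
    mutant: Dict[str, float],
    terms: List[str] = ENERGY_TERMS,
) -> List[float]:
    """Return one mutant-minus-wild-type difference per energy term."""
    return [mutant[term] - wild_type[term] for term in terms]


# Made-up energy scores for a single variant, for illustration only.
wild_type_scores = {"ref": 1.0, "fa_atr": -6.0, "fa_rep": 0.5,
                    "lk_ball_wtd": -0.25, "dslf_fa13": 0.0}
mutant_scores = {"ref": 2.5, "fa_atr": -4.0, "fa_rep": 1.0,
                 "lk_ball_wtd": 0.25, "dslf_fa13": 0.0}

features = delta_energy_features(wild_type_scores, mutant_scores)
print(dict(zip(ENERGY_TERMS, features)))
```

A vector like `features` would be one row of the input to a benign-versus-pathogenic classifier; which model to train on it is left open here.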
Pages: 606-615
Page count: 10
Related papers
50 in total
  • [1] DTreePred: an online viewer based on machine learning for pathogenicity prediction of genomic variants
    Daniel Henrique Ferreira Gomes
    Inácio Gomes Medeiros
    Tirzah Braz Petta
    Beatriz Stransky
    Jorge Estefano Santana de Souza
    BMC Bioinformatics, 26 (1)
  • [2] SVFX: a machine learning framework to quantify the pathogenicity of structural variants
    Sushant Kumar
    Arif Harmanci
    Jagath Vytheeswaran
    Mark B. Gerstein
    Genome Biology, 21
  • [3] SVFX: a machine learning framework to quantify the pathogenicity of structural variants
    Kumar, Sushant
    Harmanci, Arif
    Vytheeswaran, Jagath
    Gerstein, Mark B.
    GENOME BIOLOGY, 2020, 21 (01) : 274
  • [4] Accurate prediction of functional effect of single amino acid variants with deep learning
    Derbel, Houssemeddine
    Zhao, Zhongming
    Liu, Qian
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2023, 21 : 5776 - 5784
  • [5] LYRUS: a machine learning model for predicting the pathogenicity of missense variants
    Lai, Jiaying
    Yang, Jordan
    Gamsiz Uzun, Ece D.
    Rubenstein, Brenda M.
    Sarkar, Indra Neil
    BIOINFORMATICS ADVANCES, 2022, 2 (01):
  • [6] Machine learning prediction of amino acid patterns in protein N-myristoylation
    Okada, Ryo
    Sugii, Manabu
    Matsuno, Hiroshi
    Miyano, Satoru
    PATTERN RECOGNITION IN BIOINFORMATICS, PROCEEDINGS, 2006, 4146 : 4 - +
  • [7] Weighted Amino Acid Composition based on Amino Acid Indices for Prediction of Protein Structural Classes
    Nanuwa, Sundeep Singh
    Dziurla, Andre
    Seker, Huseyin
    2009 9TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND APPLICATIONS IN BIOMEDICINE, 2009, : 583 - 586
  • [8] Prediction of protein structural classes based on correlations of amino acid residues
    Wang, SQ
    Liu, H
    Du, QS
    Wei, DQ
    ACTA PHYSICO-CHIMICA SINICA, 2004, 20 (05) : 498 - 502
  • [9] SHINE: protein language model -based pathogenicity prediction for short inframe insertion and deletion variants
    Fan, Xiao
    Pan, Hongbing
    Tian, Alan
    Chung, Wendy K.
    Shen, Yufeng
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (01)
  • [10] InMeRF: prediction of pathogenicity of missense variants by individual modeling for each amino acid substitution
    Takeda, Jun-ichi
    Nanatsue, Kentaro
    Yamagishi, Ryosuke
    Ito, Mikako
    Haga, Nobuhiko
    Hirata, Hiromi
    Ogi, Tomoo
    Ohno, Kinji
    NAR GENOMICS AND BIOINFORMATICS, 2020, 2 (02)