Pathogenicity Prediction of Single Amino Acid Variants With Machine Learning Model Based on Protein Structural Energies

Times cited: 1
Authors
Wu, Tzu-Hsuan [1]
Lin, Peng-Chan [2]
Chou, Hsin-Hung [3]
Shen, Meng-Ru [4]
Hsieh, Sun-Yuan [1,5]
Affiliations
[1] Natl Cheng Kung Univ, Inst Med Informat, Tainan 701, Taiwan
[2] Natl Cheng Kung Univ Hosp, Dept Comp Sci & Informat Engn, Dept Internal Med, Tainan 704, Taiwan
[3] Natl Chi Nan Univ, Dept Comp Sci & Informat Engn, Puli Township 54516, Nantou County, Taiwan
[4] Natl Cheng Kung Univ, Dept Obstet & Gynecol, Dept Pharmacol, Coll Med, Tainan 701, Taiwan
[5] Natl Cheng Kung Univ, Inst Mfg Informat Syst, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan
Keywords
Machine learning; pathogenicity prediction; protein structure energy; single amino acid variants; SNP; MUTATIONS; POLYMORPHISMS
DOI
10.1109/TCBB.2021.3139048
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The most popular tools for predicting the pathogenicity of single amino acid variants (SAVs) rely on sequence-based techniques, even though SAVs may change protein structure and function. No existing method directly predicts the impact of mutations on the energies of the protein structure, such as those derived from van der Waals forces and disulfide bridges. Here, we combined machine learning methods with energy scores of protein structures calculated by the Rosetta Energy Function 2015 to predict SAV pathogenicity. The accuracy of our model (0.76) is higher than that of six prediction tools. Further analyses revealed that the differences in reference energies, attractive energies, and solvation of polar atoms between wild-type and mutant side chains played essential roles in distinguishing benign from pathogenic variants. These features reflect the physicochemical properties of amino acids as observed in 3D structures rather than in sequences. We added 16 features to Rhapsody (the prediction tool we used for our data set) and thereby improved its performance. The results indicate that these energy scores are more appropriate and more detailed representations of the pathogenicity of SAVs.
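The differential energy features mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each feature is the mutant value minus the wild-type value of one energy term; the abstract does not state the exact formula. The term names follow Rosetta REF2015 score-term naming, but the subset chosen, the function name `delta_energy_features`, and all numeric values are hypothetical and are not taken from the paper.

```python
# Illustrative subset of REF2015 score terms (an assumption, not the paper's list):
# reference energy, attractive energy, solvation terms, disulfide geometry.
TERMS = ("ref", "fa_atr", "fa_sol", "lk_ball_wtd", "dslf_fa13")


def delta_energy_features(wildtype, mutant, terms=TERMS):
    """Return one mutant-minus-wildtype difference per energy term."""
    return [mutant[t] - wildtype[t] for t in terms]


# Made-up energies for a single residue position; not data from the paper.
wt = {"ref": 1.0, "fa_atr": -4.5, "fa_sol": 2.0, "lk_ball_wtd": -0.5, "dslf_fa13": 0.0}
mut = {"ref": 0.5, "fa_atr": -3.0, "fa_sol": 2.75, "lk_ball_wtd": -0.25, "dslf_fa13": 0.0}

features = delta_energy_features(wt, mut)
```

A vector like `features` would then be one input row for whichever classifier is trained to separate benign from pathogenic variants; the choice of classifier is left open here.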
Pages: 606-615
Page count: 10