A novel structure-based encoding for machine-learning applied to the inference of SH3 domain specificity

Cited by: 21
Authors
Ferraro, E. [1]
Via, A. [1]
Ausiello, G. [1]
Helmer-Citterich, M. [1]
Affiliations
[1] Univ Tor Vergata, Dept Biol, Ctr Mol Bioinformat, Rome, Italy
DOI
10.1093/bioinformatics/btl403
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Unravelling the rules underlying protein-protein and protein-ligand interactions is a crucial step in understanding cell machinery. Peptide recognition modules (PRMs) are globular protein domains which focus their binding targets on short protein sequences and play a key role in the frame of protein-protein interactions. High-throughput techniques permit the whole proteome scanning of each domain, but they are characterized by a high incidence of false positives. In this context, there is a pressing need for the development of in silico experiments to validate experimental results and of computational tools for the inference of domain-peptide interactions.

Results: We focused on the SH3 domain family and developed a machine-learning approach for inferring interaction specificity. SH3 domains are well-studied PRMs which typically bind proline-rich short sequences characterized by the PxxP consensus. The binding information is known to be held in the conformation of the domain surface and in the short sequence of the peptide. Our method relies on interaction data from high-throughput techniques and benefits from the integration of sequence and structure data of the interacting partners. Here, we propose a novel encoding technique aimed at representing binding information on the basis of the domain-peptide contact residues in complexes of known structure. Remarkably, the new encoding requires few variables to represent an interaction, thus avoiding the 'curse of dimension'. Our results display an accuracy > 90% in detecting new binders of known SH3 domains, thus outperforming neural models on standard binary encodings, profile methods and recent statistical predictors. The method, moreover, shows a generalization capability, inferring specificity of unknown SH3 domains displaying some degree of similarity with the known data.
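To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows (a) a scan for the PxxP consensus, (b) a standard binary (one-hot) encoding, and (c) a contact-based encoding that keeps only residue pairs at contacting positions. The contact list `CONTACTS`, the function names, and the example sequences are all hypothetical placeholders; in the paper the contact positions are derived from domain-peptide complexes of known structure, and the exact encoding may differ.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Lookahead so that overlapping PxxP occurrences are all reported.
PXXP = re.compile(r"(?=(P..P))")


def find_pxxp(peptide):
    """Return the start index of every (possibly overlapping) PxxP match."""
    return [m.start() for m in PXXP.finditer(peptide)]


def one_hot(seq):
    """Standard binary encoding: 20 indicator variables per residue."""
    vec = []
    for aa in seq:
        vec.extend(1 if aa == a else 0 for a in AMINO_ACIDS)
    return vec


# HYPOTHETICAL contact list: (domain position, peptide position) pairs.
# Real positions would come from complexes of known structure.
CONTACTS = [(0, 0), (2, 1), (2, 3), (5, 4)]


def contact_encoding(domain, peptide, contacts=CONTACTS):
    """Represent an interaction by the residue pairs at contact positions."""
    return [(domain[i], peptide[j]) for i, j in contacts]


if __name__ == "__main__":
    domain = "ALYDYEA"   # toy domain fragment, not a real SH3 sequence
    peptide = "PPLPP"    # toy proline-rich peptide
    print(find_pxxp("APPLPPR"))
    print(len(one_hot(domain + peptide)), "binary variables")
    print(len(contact_encoding(domain, peptide)), "contact variables")
```

In this toy setting the one-hot vector grows by 20 variables per residue of both partners, while the contact encoding grows only with the number of contact pairs, which illustrates why a contact-based representation can need far fewer variables.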
Pages: 2333-2339
Page count: 7
Related papers
50 in total
  • [31] The Free Energy Contribution of SH3 and SH2 in c-Abl 1b Autoinhibition Mechanism via a Computational Structure-Based Model
    Mereu, Ilaria
    Sutto, Ludovico
    Gervasio, Francesco L.
    [J]. BIOPHYSICAL JOURNAL, 2014, 106 (02) : 253A - 254A
  • [32] SH3GL2 and MMP17 as lung adenocarcinoma biomarkers: a machine-learning based approach
    Tian, Zengjian
    Yu, Shilong
    Cai, Ruizhi
    Zhang, Yinghui
    Liu, Qilun
    Zhu, Yongzhao
    [J]. BIOCHEMISTRY AND BIOPHYSICS REPORTS, 2024, 38
  • [33] Structure-based design of a novel series of nonpeptide ligands that bind to the pp60src SH2 domain
    Lunney, EA
    Para, KS
    Rubin, JR
    Humblet, C
    Fergus, JH
    Marks, JS
    Sawyer, TK
    [J]. JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 1997, 119 (51) : 12471 - 12476
  • [34] Discovery of novel dual acetylcholinesterase and butyrylcholinesterase inhibitors using machine learning and structure-based drug design
    Tripathi, Manish Kumar
    Bhardwaj, Bhagwati
    Waiker, Digambar Kumar
    Tripathi, Avanish
    Shrivastava, Sushant Kumar
    [J]. JOURNAL OF MOLECULAR STRUCTURE, 2023, 1286
  • [35] PharmRF: A machine-learning scoring function to identify the best protein-ligand complexes for structure-based pharmacophore screening with high enrichments
    Kumar, Sivakumar Prasanth
    Dixit, Nandan Y.
    Patel, Chirag N.
    Rawal, Rakesh M.
    Pandya, Himanshu A.
    [J]. JOURNAL OF COMPUTATIONAL CHEMISTRY, 2022, 43 (12) : 847 - 863
  • [36] Novel STAT3 small-molecule inhibitors identified by structure-based virtual ligand screening incorporating SH2 domain flexibility
    Kong, Ren
    Bharadwaj, Uddalak
    Eckols, T. Kris
    Kolosov, Mikhail
    Wu, Haoyi
    Cruz-Pavlovich, Francisco J. Santa
    Shaw, Alison
    Ifelayo, Oluwatomilona I.
    Zhao, Hong
    Kasembeli, Moses M.
    Wong, Stephen T. C.
    Tweardy, David J.
    [J]. PHARMACOLOGICAL RESEARCH, 2021, 169
  • [37] Machine learning prediction of activation energy in cubic Li-argyrodites with hierarchically encoding crystal structure-based (HECS) descriptors
    Zhao, Qian
    Avdeev, Maxim
    Chen, Liquan
    Shi, Siqi
    [J]. SCIENCE BULLETIN, 2021, 66 (14) : 1401 - 1408
  • [38] Mutations in a gene encoding a novel SH3/TPR domain protein cause autosomal recessive Charcot-Marie-Tooth type 4C neuropathy
    Senderek, J
    Bergmann, C
    Stendel, C
    Kirfel, J
    Verpoorten, N
    De Jonghe, P
    Timmerman, V
    Chrast, R
    Verheijen, MHG
    Lemke, G
    Battaloglu, E
    Parman, Y
    Erdem, S
    Tan, E
    Topaloglu, H
    Hahn, A
    Müller-Felber, W
    Rizzuto, N
    Fabrizi, GM
    Stuhrmann, M
    Rudnik-Schöneborn, S
    Züchner, S
    Schröder, JM
    Buchheim, E
    Straub, V
    Klepper, JR
    Huehne, K
    Rautenstrauss, B
    Büttner, R
    Nelis, E
    Zerres, K
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2003, 73 (05) : 1106 - 1119
  • [39] Incorporating Domain Knowledge and Structure-Based Descriptors for Machine Learning: A Case Study of Pd-Catalyzed Sonogashira Reactions
    Chan, Kalok
    Ta, Long Thanh
    Huang, Yong
    Su, Haibin
    Lin, Zhenyang
    [J]. MOLECULES, 2023, 28 (12)
  • [40] Deep learning in the 3rd dimension: Structure-based bioactivity prediction on novel targets
    Heifets, Abraham
    Wallach, Izhar
    Dzamba, Michael
    [J]. ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2016, 251