A novel structure-based encoding for machine-learning applied to the inference of SH3 domain specificity

Cited by: 21
Authors
Ferraro, E. [1]
Via, A. [1]
Ausiello, G. [1]
Helmer-Citterich, M. [1]
Affiliations
[1] Univ Tor Vergata, Dept Biol, Ctr Mol Bioinformat, Rome, Italy
DOI
10.1093/bioinformatics/btl403
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Unravelling the rules underlying protein-protein and protein-ligand interactions is a crucial step in understanding cell machinery. Peptide recognition modules (PRMs) are globular protein domains which focus their binding targets on short protein sequences and play a key role in the frame of protein-protein interactions. High-throughput techniques permit the whole proteome scanning of each domain, but they are characterized by a high incidence of false positives. In this context, there is a pressing need for the development of in silico experiments to validate experimental results and of computational tools for the inference of domain-peptide interactions.

Results: We focused on the SH3 domain family and developed a machine-learning approach for inferring interaction specificity. SH3 domains are well-studied PRMs which typically bind proline-rich short sequences characterized by the PxxP consensus. The binding information is known to be held in the conformation of the domain surface and in the short sequence of the peptide. Our method relies on interaction data from high-throughput techniques and benefits from the integration of sequence and structure data of the interacting partners. Here, we propose a novel encoding technique aimed at representing binding information on the basis of the domain-peptide contact residues in complexes of known structure. Remarkably, the new encoding requires few variables to represent an interaction, thus avoiding the 'curse of dimension'. Our results display an accuracy > 90% in detecting new binders of known SH3 domains, thus outperforming neural models on standard binary encodings, profile methods and recent statistical predictors. The method, moreover, shows a generalization capability, inferring specificity of unknown SH3 domains displaying some degree of similarity with the known data.
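To make the dimensionality argument concrete, the following is a minimal sketch of the general idea, not the authors' actual encoding. It contrasts a standard binary (one-hot) encoding of a domain-peptide pair with a contact-based encoding that keeps one variable per contacting residue pair. The sequences, the contact positions, and the function names `binary_encoding` and `contact_encoding` are hypothetical and chosen only for illustration; in practice the contact map would be derived from complexes of known structure.

```python
# Illustrative sketch only: toy sequences and a made-up contact map.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def binary_encoding(domain_seq, peptide_seq):
    """One-hot encode every residue: 20 variables per position."""
    vector = []
    for residue in domain_seq + peptide_seq:
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[AA_INDEX[residue]] = 1
        vector.extend(one_hot)
    return vector


def contact_encoding(domain_seq, peptide_seq, contacts):
    """Encode only contacting residue pairs: one variable per contact.

    `contacts` is a list of (domain_position, peptide_position) pairs
    (0-based). Each pair is mapped to a single integer identifying the
    residue pair (0..399). This mapping is an assumption of the sketch.
    """
    return [
        AA_INDEX[domain_seq[d]] * len(AMINO_ACIDS) + AA_INDEX[peptide_seq[p]]
        for d, p in contacts
    ]


# Hypothetical inputs (not taken from the paper).
domain = "ALYDYEARTE"
peptide = "RPLPPLP"
contacts = [(2, 1), (4, 3), (7, 0)]

full = binary_encoding(domain, peptide)
compact = contact_encoding(domain, peptide, contacts)
print(len(full), len(compact))
```

In this toy case the binary encoding needs 20 variables for each of the 17 residues, while the contact-based one needs only as many variables as there are contacts, which is the kind of reduction the abstract refers to.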
Pages: 2333-2339
Number of pages: 7
Related papers
50 in total
  • [1] A neural strategy for the inference of SH3 domain-peptide interaction specificity
    Enrico Ferraro
    Allegra Via
    Gabriele Ausiello
    Manuela Helmer-Citterich
    [J]. BMC Bioinformatics, 6
  • [2] A neural strategy for the inference of SH3 domain-peptide interaction specificity
    Ferraro, E
    Via, A
    Ausiello, G
    Helmer-Citterich, M
    [J]. BMC BIOINFORMATICS, 2005, 6 (Suppl 4)
  • [3] A novel gene encoding an SH3 domain protein is mutated in nephronophthisis type 1
    Hildebrandt, F
    Otto, E
    Rensing, C
    Nothwang, HG
    Vollmer, M
    Adolphs, J
    Hanusch, H
    Brandis, M
    [J]. NATURE GENETICS, 1997, 17 (02) : 149 - 153
  • [4] A novel gene encoding an SH3 domain protein is mutated in nephronophthisis type 1
    Friedhelm Hildebrandt
    Edgar Otto
    Cornelia Rensing
    Hans Gerd Nothwang
    Martin Vollmer
    Jörn Adolphs
    Helge Hanusch
    Matthias Brandis
    [J]. Nature Genetics, 1997, 17 : 149 - 153
  • [5] Machine-learning scoring functions for structure-based virtual screening
    Li, Hongjian
    Sze, Kam-Heung
    Lu, Gang
    Ballester, Pedro J.
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE, 2021, 11 (01)
  • [6] A practical guide to machine-learning scoring for structure-based virtual screening
    Viet-Khoa Tran-Nguyen
    Muhammad Junaid
    Saw Simeon
    Pedro J. Ballester
    [J]. Nature Protocols, 2023, 18 : 3460 - 3511
  • [7] Machine-learning scoring functions for structure-based drug lead optimization
    Li, Hongjian
    Sze, Kam-Heung
    Lu, Gang
    Ballester, Pedro J.
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE, 2020, 10 (05)
  • [8] Performance of machine-learning scoring functions in structure-based virtual screening
    Wojcikowski, Maciej
    Ballester, Pedro J.
    Siedlecki, Pawel
    [J]. SCIENTIFIC REPORTS, 2017, 7
  • [9] A practical guide to machine-learning scoring for structure-based virtual screening
    Tran-Nguyen, Viet-Khoa
    Junaid, Muhammad
    Simeon, Saw
    Ballester, Pedro J.
    [J]. NATURE PROTOCOLS, 2023, 18 (11) : 3460 - 3511
  • [10] Performance of machine-learning scoring functions in structure-based virtual screening
    Maciej Wójcikowski
    Pedro J. Ballester
    Pawel Siedlecki
    [J]. Scientific Reports, 7