Identification of hot regions in protein-protein interactions by sequential pattern mining

Cited by: 34
Authors
Hsu, Chen-Ming
Chen, Chien-Yu [1]
Liu, Baw-Jhiune
Huang, Chih-Chang
Laio, Min-Hung
Lin, Chien-Chieh
Wu, Tzung-Lin
Affiliations
[1] Yuan Ze Univ, Dept Comp Sci & Engn, Chungli 320, Taiwan
[2] Natl Taiwan Univ, Dept Bioind Mechatron Engn, Taipei 106, Taiwan
DOI
10.1186/1471-2105-8-S5-S8
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Identification of protein interacting sites is an important task in computational molecular biology. As more and more protein sequences are deposited without available structural information, it is highly desirable to predict protein binding regions from sequence alone. This paper presents a pattern mining approach to this problem. A functional region of a protein structure usually consists of several peptide segments linked by large wildcard regions. The proposed mining technique therefore allows large irregular gaps when growing patterns, in order to find residues that are simultaneously conserved but widely separated on the sequences. A derived pattern is called a cluster-like pattern because the discovered conserved residues are always grouped into several blocks, each of which corresponds to a locally conserved region on the protein sequence.
Results: The experiments conducted in this work demonstrate that the derived long patterns automatically discover the important residues that form one or several hot regions of protein-protein interactions. The methodology is evaluated by conducting experiments on the web server MAGIIC-PRO, using a well-known benchmark containing 220 protein chains from 72 distinct complexes. Among the 218 proteins tested, 900 sequential blocks are discovered, 4.25 blocks per protein chain on average. About 92% of the derived blocks are observed to be clustered in space with at least one of the other blocks, and about 66% of the blocks are found to be near the interface of protein-protein interactions. In summary, for about 83% of the tested proteins, at least two interacting blocks can be discovered by this approach.
Conclusion: This work aims to demonstrate that the important residues associated with the interface of protein-protein interactions may be automatically discovered by sequential pattern mining. The detected regions are highly conserved and are therefore considered computational hot regions. This information would be useful for characterizing protein sequences, predicting protein function, finding potential partners, and facilitating protein docking for drug discovery.
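The idea of a cluster-like pattern, conserved blocks separated by large flexible gaps, can be illustrated with a minimal Python sketch. This is not the MAGIIC-PRO algorithm and does not mine patterns; it only shows how a given pattern of blocks could be matched against sequences and how its support (the fraction of sequences containing it) could be computed. The function names, the lowercase `x` wildcard notation, the `max_gap` parameter, and the toy sequences are all assumptions made for illustration.

```python
import re

def pattern_to_regex(blocks, max_gap):
    """Compile a cluster-like pattern into a regular expression.

    blocks  : list of strings; a lowercase 'x' inside a block is a
              single-residue wildcard (assumed notation).
    max_gap : maximum number of arbitrary residues allowed between
              two consecutive blocks (hypothetical parameter).
    """
    gap = ".{0,%d}" % max_gap
    parts = []
    for block in blocks:
        body = "".join("." if c == "x" else re.escape(c) for c in block)
        parts.append("(" + body + ")")  # one capture group per block
    return re.compile(gap.join(parts))

def support(blocks, sequences, max_gap):
    """Fraction of sequences that contain the pattern."""
    rx = pattern_to_regex(blocks, max_gap)
    return sum(1 for s in sequences if rx.search(s)) / len(sequences)

def block_positions(blocks, sequence, max_gap):
    """Return (start, end) spans of each block in the first match, or None."""
    m = pattern_to_regex(blocks, max_gap).search(sequence)
    if m is None:
        return None
    return [m.span(i + 1) for i in range(len(blocks))]

# Toy data, invented for this example only.
toy_blocks = ["CxxC", "HG", "DE"]
toy_sequences = [
    "MACAACLLLLLHGTTTTTTTTDEK",  # all three blocks, separated by gaps
    "CQWCHGDE",                  # all three blocks, no gaps
    "MACAACLLLLL",               # first block only
]
```

With these toy inputs, `support(toy_blocks, toy_sequences, 20)` counts the first two sequences as matches, and `block_positions` reports where each block falls, which is the kind of per-block location one would then compare against a known interface.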
Pages: 15
Related papers
50 in total
  • [31] Integrating protein-protein interactions and text mining for protein function prediction
    Jaeger, Samira
    Gaudan, Sylvain
    Leser, Ulf
    Rebholz-Schuhmann, Dietrich
    BMC BIOINFORMATICS, 2008, 9 (Suppl 8)
  • [32] Integrating protein-protein interactions and text mining for protein function prediction
    Samira Jaeger
    Sylvain Gaudan
    Ulf Leser
    Dietrich Rebholz-Schuhmann
    BMC Bioinformatics, 9
  • [33] Fuzzy regions in an intrinsically disordered protein impair protein-protein interactions
    Gruet, Antoine
    Dosnon, Marion
    Blocquel, David
    Brunel, Joanna
    Gerlier, Denis
    Das, Rahul K.
    Bonetti, Daniela
    Gianni, Stefano
    Fuxreiter, Monika
    Longhi, Sonia
    Bignon, Christophe
    FEBS JOURNAL, 2016, 283 (04) : 576 - 594
  • [34] Document classification for mining host pathogen protein-protein interactions
    Yin, Lanlan
    Xu, Guixian
    Torii, Manabu
    Niu, Zhendong
    Maisog, Jose M.
    Wu, Cathy
    Hu, Zhangzhi
    Liu, Hongfang
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2010, 49 (03) : 155 - 160
  • [35] PPI Finder: A Mining Tool for Human Protein-Protein Interactions
    He, Min
    Wang, Yi
    Li, Wei
    PLOS ONE, 2009, 4 (02):
  • [36] DAPPER: a data-mining resource for protein-protein interactions
    Haider, Syed
    Lipinszki, Zoltan
    Przewloka, Marcin R.
    Ladak, Yaseen
    D'Avino, Pier Paolo
    Kimata, Yuu
    Lio, Pietro
    Glover, David M.
    BIODATA MINING, 2015, 8
  • [37] DAPPER: a data-mining resource for protein-protein interactions
    Syed Haider
    Zoltan Lipinszki
    Marcin R. Przewloka
    Yaseen Ladak
    Pier Paolo D’Avino
    Yuu Kimata
    Pietro Lio’
    David M. Glover
    BioData Mining, 8
  • [38] Document Classification for Mining Host Pathogen Protein-Protein Interactions
    Xu, Guixian
    Yin, Lanlan
    Torii, Manabu
    Niu, Zhendong
    Wu, Cathy
    Hu, Zhangzhi
    Liu, Hongfang
    2008 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, PROCEEDINGS, 2008, : 461 - +
  • [39] Hot spot prediction in protein-protein interactions by an ensemble system
    Liu, Quanya
    Chen, Peng
    Wang, Bing
    Zhang, Jun
    Li, Jinyan
    BMC SYSTEMS BIOLOGY, 2018, 12
  • [40] Strategies and methods in the identification of antagonists of protein-protein interactions
    Gadek, TR
    BIOTECHNIQUES, 2003, : 21 - 24