Mining of protein-protein interfacial residues from massive protein sequential and spatial data

Cited by: 5
Authors
Wang, Debby D. [1]
Zhou, Weiqiang [1]
Yan, Hong [1]
Affiliations
[1] City Univ Hong Kong, Dept Elect Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Protein-protein interface prediction; 3D alpha shape modeling; Residue sequence profile; Joint mutual information (JMI); Neuro-fuzzy classifiers (NFCs); Neighborhood classifiers (NECs); CART; Extreme learning machines (ELMs); Naive Bayesian classifiers (NBCs); BIG DATA; INTERACTION SITES; DATA-BANK; INFORMATION; PREDICTION; NETWORK;
DOI
10.1016/j.fss.2014.01.017
Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
Processing big data is a major challenge in bioinformatics. In this paper, we address the problem of identifying protein-protein interfacial residues from massive protein structural data. A protein set comprising 154993 residues was analyzed. We applied three-dimensional alpha shape modeling to identify the surface and interfacial residues in this set, and adopted spatially neighboring residue profiles to characterize each residue. These residue profiles, which capture the sequential and spatial information of proteins, translated the original data into a large matrix. After refining this matrix vertically and horizontally, we implemented and compared a series of popular learning procedures, including neuro-fuzzy classifiers (NFCs), CART, neighborhood classifiers (NECs), extreme learning machines (ELMs) and naive Bayesian classifiers (NBCs), to predict the interfacial residues, aiming to investigate the sensitivity of these massive structural data to different learning mechanisms. As a result, ELMs, CART and NFCs performed better in terms of computational cost, while NFCs, NBCs and ELMs provided favorable prediction accuracies. Overall, NFCs, NBCs and ELMs are favorable choices for handling this type of data quickly and accurately. More importantly, the marginal differences between the prediction performances of these methods imply that this type of data is insensitive to the choice of learning mechanism. (C) 2014 Elsevier B.V. All rights reserved.
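The abstract describes characterizing each residue by a profile built from its spatial neighbors, which turns the data set into one large matrix. The record does not say how neighbors are chosen or which per-residue features are used, so the following is only a minimal sketch under stated assumptions: the function name `neighbor_profiles`, the choice of the `k` nearest residues by Euclidean distance, and the toy coordinates and feature values are all hypothetical and are not taken from the paper.

```python
import math


def neighbor_profiles(coords, features, k=2):
    """Build one matrix row per residue.

    Each row is the residue's own feature vector followed by the feature
    vectors of its k nearest spatial neighbors (an assumed neighborhood rule).
    """
    rows = []
    for i, ci in enumerate(coords):
        # Rank all other residues by Euclidean distance to residue i;
        # ties are broken by index so the result is deterministic.
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: (math.dist(ci, coords[j]), j),
        )
        row = list(features[i])
        for j in others[:k]:
            row.extend(features[j])
        rows.append(row)
    return rows


# Toy data (hypothetical): four residues with 3D positions and one feature each.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
features = [[0.1], [0.2], [0.3], [0.4]]
profiles = neighbor_profiles(coords, features, k=2)
```

Each row of `profiles` could then be passed, together with an interface or non-interface label, to any of the classifiers the abstract lists. The all-pairs distance ranking used here is quadratic in the number of residues, so a set of the size reported in the abstract would in practice call for a spatial index.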
Pages: 101-116
Page count: 16