IDPpred: a new sequence-based predictor for identification of intrinsically disordered protein with enhanced accuracy

Cited by: 1
Authors
Chaurasiya, Deepak [1]
Mondal, Rajkrishna [2]
Lahiri, Tapobrata [1]
Tripathi, Asmita [1]
Ghinmine, Tejas [1]
Affiliations
[1] Indian Inst Informat Technol, Dept Appl Sci, Prayagraj, Uttar Pradesh, India
[2] Nagaland Univ, Dept Biotechnol, Dimapur, Nagaland, India
Source
Keywords
Intrinsically disordered protein; numerical representation of sequence; periodicity count value and predictor; REGIONS;
DOI
10.1080/07391102.2023.2290615
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The discovery of intrinsically disordered proteins (IDPs) and of hybrid proteins that contain both intrinsically disordered protein regions (IDPRs) and ordered regions has changed the sequence-structure-function paradigm of proteins. These proteins, which lack a persistent fixed structure, are found in all organisms and play vital roles in various biological processes. Some are considered potential drug targets because of their overrepresentation in pathophysiological processes. The major bottlenecks in characterizing such proteins are their occasional overexpression, the difficulty of obtaining them in a purified, homogeneous form, and the challenge of investigating them experimentally. Sequence-based prediction of intrinsic disorder remains a useful strategy, especially for large-scale proteomic investigations. However, accuracy is still lowest for short disordered regions of fewer than ten residues, for residues close to order-disorder boundaries, for regions that undergo coupled folding and binding in the presence of a partner, and for fully disordered proteins. Annotation of fully disordered proteins relies mostly on far-UV circular dichroism experiments, which give the overall secondary structure composition without residue-level resolution. In the recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment, current methods, including those that use secondary structure information, failed to predict half of the target IDPs correctly. This study used profiles of the random sequential appearance of the physicochemical properties of amino acids and of order- and disorder-promoting amino acids in a protein, together with the existing CIDER feature, to predict IDPs from sequence input. Our method was found to significantly outperform the existing predictors across different datasets.
Pages: 957-965
Page count: 9