Identification of Intrinsically Disordered Proteins and Regions by Length-Dependent Predictors Based on Conditional Random Fields
Cited by: 13
Authors:
Liu, Yumeng [1]
Chen, Shengyu [2]
Wang, Xiaolong [1]
Liu, Bin [1,3,4]
Affiliations:
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China
[2] Indiana Univ, Sch Informat Comp & Engn, Bloomington, IN 47408 USA
[3] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
[4] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing 100081, Peoples R China
Source:
Funding:
National Natural Science Foundation of China;
Keywords:
ACCURATE PREDICTION;
UNSTRUCTURED REGIONS;
SEQUENCE;
SERVER;
DOI:
10.1016/j.omtn.2019.06.004
Chinese Library Classification number:
R-3 [Medical research methods];
R3 [Basic medicine];
Discipline classification code:
1001;
Abstract:
Accurate identification of intrinsically disordered proteins/regions (IDPs/IDRs) is critical for predicting protein structure and function. Previous studies have shown that IDRs of different lengths have different characteristics, and several classification-based predictors have been proposed for predicting different types of IDRs. Compared with these classification-based predictors, the previously proposed predictor IDP-CRF, a sequence labeling model based on conditional random fields (CRFs), exhibits state-of-the-art performance for predicting IDPs/IDRs. Motivated by these methods, we propose a predictor called IDP-FSP, an ensemble of three CRF-based predictors called IDP-FSP-L, IDP-FSP-S, and IDP-FSP-G. These three predictors are specially designed to predict long, short, and generic disordered regions, respectively, and they are constructed based on different features. To the best of our knowledge, IDP-FSP is the first predictor that combines a sequence labeling algorithm with IDRs of different lengths. Experimental results on two independent test datasets show that IDP-FSP achieves predictive performance better than or at least comparable to that of 26 existing state-of-the-art methods in this field, demonstrating the effectiveness of IDP-FSP.
Pages: 396-404
Page count: 9