pQLyCar: Peptide-based dynamic query-driven sample rescaling strategy for identifying carboxylation sites combined with KNN and SVM

Cited by: 1
Authors
Ning, Qiao [1]
Deng, Ansheng [1]
Zou, Tingting [1]
Zhao, Xiaowei [2]
Affiliations
[1] Dalian Maritime Univ, Dept Informat Sci & Technol, Dalian 116026, Peoples R China
[2] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Peoples R China
Keywords
Carboxylation; KNN algorithm; Peptide-based dynamic query-driven sample rescaling strategy; Information gain; SUCCINYLATION SITES; PROTEIN; PREDICTION; SEQUENCES
DOI
10.1016/j.ab.2021.114386
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Lysine carboxylation is one of the most crucial types of post-translational modification and plays a significant role in catalytic mechanisms, so it is essential to study lysine carboxylation and explore its biological mechanism. Compared with traditional experimental methods, which are labor-intensive and time-consuming, computational methods are more convenient and faster, which makes an accurate carboxylation identification model an urgent need. Herein we propose a method, named pQLyCar, for identifying lysine carboxylation sites using an SVM as the classifier. In pQLyCar, a peptide-based dynamic query-driven sample rescaling strategy (pDQD-SR) is proposed to address the class imbalance of the training data; it builds a specific prediction model for each query sample. The KNN algorithm calculates the distance between samples from the original sequences rather than from feature vectors. Information entropy is applied to select the optimal sliding-window size, and various sequence- and position-based features are incorporated to construct the feature space, including residue composition (RC), K-space, and position-specific amino acid propensity (PSAAP). Finally, under the jackknife test, pQLyCar achieves a specificity of 96.49% and a sensitivity of 99.59%, which indicates that pQLyCar can be a useful tool for predicting lysine carboxylation sites.
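The query-driven rescaling idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the Hamming distance, the choice of keeping as many negatives as there are positives, and the names `hamming`, `rescale_for_query`, and `residue_composition` are all assumptions made for illustration. The abstract only states that distances are computed on the original sequences and that a separate model is built for each query sample.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length peptides.

    Assumption: the record does not name the sequence distance, so
    Hamming distance stands in for it here.
    """
    if len(a) != len(b):
        raise ValueError("peptides must share the same window length")
    return sum(x != y for x, y in zip(a, b))


def rescale_for_query(query, positives, negatives, k=None):
    """Build a query-specific training set (hypothetical helper).

    Keeps every positive peptide and only the k negatives closest to
    the query, measured on the raw sequences. By default k equals the
    number of positives, which balances the two classes.
    """
    if k is None:
        k = len(positives)
    nearest = sorted(negatives, key=lambda p: hamming(query, p))[:k]
    return list(positives), nearest


def residue_composition(peptide):
    """Fraction of each standard amino acid in the peptide window."""
    counts = Counter(peptide)
    return [counts[a] / len(peptide) for a in AMINO_ACIDS]


# Toy windows centred on lysine (K); the sequences are made up.
positives = ["AAKAA", "GAKAG"]
negatives = ["AAKAG", "WWKWW", "AGKAA", "YYKYY"]
pos, neg = rescale_for_query("AAKAA", positives, negatives)
features = residue_composition("AAKAA")
```

In a full pipeline, the balanced set returned for each query would be encoded into feature vectors and used to train a classifier such as an SVM for that query alone.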
Pages: 8
Related papers
1 in total
  • [1] KNN-based dynamic query-driven sample rescaling strategy for class imbalance learning
    Hu, Jun
    Li, Yang
    Yan, Wu-Xia
    Yang, Jing-Yu
    Shen, Hong-Bin
    Yu, Dong-Jun
    Neurocomputing, 2016, 191: 363-373