Extracting features from protein sequences to improve deep extreme learning machine for protein fold recognition

Cited by: 14
Authors
Ibrahim, Wisam [1]
Abadeh, Mohammad Saniee [1]
Affiliations
[1] Tarbiat Modares Univ, Fac Elect & Comp Engn, Tehran, Iran
Keywords
Protein fold recognition; Extreme learning machine; Protein descriptor; Feature extraction; AMINO-ACID-COMPOSITION; LYSINE SUCCINYLATION SITES; GENERAL-FORM; ENSEMBLE CLASSIFIER; STRUCTURAL CLASSES; SUBCELLULAR-LOCALIZATION; PSEUDO COMPONENTS; DIFFERENT MODES; K-TUPLE; PREDICTION;
DOI
10.1016/j.jtbi.2017.03.023
Chinese Library Classification code
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Protein fold recognition is an important problem in bioinformatics: it aims to predict the three-dimensional structure of a protein. One of the most challenging tasks in protein fold recognition is extracting efficient features from amino-acid sequences in order to obtain better classifiers. In this paper, we propose six descriptors for extracting features from protein sequences. These descriptors are applied in the first stage of a three-stage framework, PCA-DELM-LDA, to extract feature vectors from the amino-acid sequences. Principal Component Analysis (PCA) is used to reduce the number of extracted features. The extracted feature vectors are used together with the original features to improve the performance of the Deep Extreme Learning Machine (DELM) in the second stage. Four new features are extracted in the second stage and used in the third stage by Linear Discriminant Analysis (LDA) to classify the instances into 27 folds. The proposed framework is applied to the independent and combined feature sets in SCOP datasets. The experimental results show that the feature vectors extracted in the first stage can improve the performance of DELM in extracting new, useful features in the second stage. (C) 2017 Elsevier Ltd. All rights reserved.
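To illustrate what a sequence descriptor of the kind mentioned in the abstract does, the following is a minimal sketch of amino-acid composition, a generic and widely used descriptor that also appears in the keywords. It is not claimed to be one of the six descriptors proposed in the paper, which this record does not specify. The function name `amino_acid_composition` and the example sequence are assumptions made for illustration only.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes, in a fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Map a protein sequence to a 20-dimensional vector of residue frequencies.

    Hypothetical helper for illustration; non-standard symbols are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One frequency per residue type, so the vector sums to 1.
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Made-up example sequence, not taken from any dataset.
features = amino_acid_composition("MKVLAAGIVK")
```

A fixed-length vector like `features` is the kind of input that could then be passed to a dimensionality-reduction step such as PCA before classification; how the paper combines its own descriptors with the original features is not detailed in this record.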
Pages: 1-15
Number of pages: 15
Related papers
50 in total
  • [31] Identify protein disorder from amino acid sequences with Machine learning
    Iyer, Shrinath
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ELECTRO INFORMATION TECHNOLOGY (EIT), 2021, : 429 - 436
  • [32] Machine learning methods for predicting protein structure from single sequences
    Kandathil, Shaun M.
    Lau, Andy M.
    Jones, David T.
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 2023, 81
  • [33] DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions
    Zhang, Fuhao
    Song, Hong
    Zeng, Min
    Li, Yaohang
    Kurgan, Lukasz
    Li, Min
    [J]. PROTEOMICS, 2019, 19 (12)
  • [34] Protein fold classification based on machine learning paradigm - A review
    Silesian University of Technology, Institute of Computer Science, Akademicka 16, Gliwice
    44-100, Poland
    [J]. Bio-Algorithms Med-Syst., 1
  • [36] An Overview on Protein Fold Classification via Machine Learning Approach
    Tian, Xiaoyu
    Chen, Daozheng
    Gao, Jun
    [J]. CURRENT PROTEOMICS, 2018, 15 (02) : 85 - 98
  • [37] Protein sequence classification using extreme learning machine
    Wang, DH
    Huang, GB
    [J]. Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vols 1-5, 2005, : 1406 - 1411
  • [38] Structural protein fold recognition based on secondary structure and evolutionary information using machine learning algorithms
    Qin, Xinyi
    Liu, Min
    Zhang, Lu
    Liu, Guangzhong
    [J]. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2021, 91
  • [39] Prediction of apoptosis protein subcellular localization via heterogeneous features and hierarchical extreme learning machine
    Zhang, S.
    Zhang, T.
    Liu, C.
    [J]. SAR AND QSAR IN ENVIRONMENTAL RESEARCH, 2019, 30 (03) : 209 - 228
  • [40] Deep motion templates and extreme learning machine for sign language recognition
    Imran, Javed
    Raman, Balasubramanian
    [J]. VISUAL COMPUTER, 2020, 36 (06) : 1233 - 1246