Extracting features from protein sequences to improve deep extreme learning machine for protein fold recognition

Cited by: 14
Authors
Ibrahim, Wisam [1 ]
Abadeh, Mohammad Saniee [1 ]
Affiliation
[1] Tarbiat Modares Univ, Fac Elect & Comp Engn, Tehran, Iran
Keywords
Protein fold recognition; Extreme learning machine; Protein descriptor; Feature extraction; AMINO-ACID-COMPOSITION; LYSINE SUCCINYLATION SITES; GENERAL-FORM; ENSEMBLE CLASSIFIER; STRUCTURAL CLASSES; SUBCELLULAR-LOCALIZATION; PSEUDO COMPONENTS; DIFFERENT MODES; K-TUPLE; PREDICTION;
DOI
10.1016/j.jtbi.2017.03.023
Chinese Library Classification (CLC)
Q [Biological sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Protein fold recognition is an important problem in bioinformatics: it aims to predict the three-dimensional structure of a protein. One of the most challenging tasks in protein fold recognition is extracting effective features from amino-acid sequences so that better classifiers can be built. In this paper, we propose six descriptors for extracting features from protein sequences. These descriptors are applied in the first stage of a three-stage framework, PCA-DELM-LDA, to extract feature vectors from the amino-acid sequences. Principal Component Analysis (PCA) is used to reduce the number of extracted features. The extracted feature vectors are combined with the original features to improve the performance of the Deep Extreme Learning Machine (DELM) in the second stage. Four new features are extracted in the second stage and passed to the third stage, where Linear Discriminant Analysis (LDA) classifies the instances into 27 folds. The proposed framework is applied to the independent and combined feature sets of SCOP datasets. The experimental results show that the feature vectors extracted in the first stage can improve the ability of DELM to extract useful new features in the second stage. © 2017 Elsevier Ltd. All rights reserved.
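To make the idea of a sequence descriptor concrete, the sketch below computes amino-acid composition, a standard descriptor that also appears among the keywords of this record. It is an illustrative assumption, not necessarily one of the six descriptors proposed in the paper, which the abstract does not specify. The function name amino_acid_composition and the example sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino-acid residues, in a fixed order so that every
# sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional list of residue frequencies.

    Characters outside the 20 standard residues (for example 'X') are
    ignored. An empty or fully non-standard sequence yields all zeros.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical example sequence, used only to show the call.
features = amino_acid_composition("MKVLAAGIVGL")
```

A fixed-length vector of this kind is what a dimensionality-reduction step such as PCA and a downstream classifier would take as input; richer descriptors would add further dimensions in the same way.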
Pages: 1-15
Number of pages: 15