Protein Secondary Structural Class Prediction using Effective Feature Modeling and Machine Learning Techniques

Cited by: 7
Authors
Bankapur, Sanjay [1]
Patil, Nagamma [1]
Affiliations
[1] National Institute of Technology Karnataka, Department of Information Technology, Mangalore, India
Keywords
amino acid sequence; bi-gram; character embedding; machine learning; protein secondary structural sequence; skip-gram; amino acid composition
DOI
10.1109/BIBE.2018.00012
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Protein Secondary Structural Class (PSSC) prediction is an important step toward determining a protein's folds, tertiary structure, and functions, which in turn have potential applications in drug discovery. Various computational methods have been developed to predict the PSSC; however, predicting the PSSC on the basis of protein sequences is still a challenging task. In this study, we propose an effective approach that extracts features using two techniques: (i) SkipXGram bi-gram, in which skipped bi-gram features are extracted, and (ii) character embedded features, in which features are extracted using a word embedding approach. The combined feature sets from the proposed feature modeling approach are explored using various machine learning classifiers. The best-performing classifier (Random Forest) is benchmarked against state-of-the-art PSSC prediction models. The proposed model was assessed on two low-sequence-similarity benchmark datasets, 25PDB and FC699. The performance analysis demonstrates that the proposed model consistently outperformed state-of-the-art models by 3% to 23% on the 25PDB dataset and by 4% to 6% on the FC699 dataset.
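The abstract does not spell out how the skipped bi-gram features are computed, so the following is only a minimal sketch under stated assumptions: a "skip-k bi-gram" is taken to mean a pair of residues separated by k intervening positions, counts are normalized by the number of valid pairs, and feature vectors for several skip values are concatenated. The function names `skip_bigram_features` and `combined_skip_features`, the `max_skip` parameter, and the normalization are hypothetical choices for illustration, not the authors' published SkipXGram definition.

```python
from itertools import product

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered residue pairs, in a fixed order, so every sequence
# maps to a feature vector with the same layout.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def skip_bigram_features(sequence, skip=0):
    """Normalized frequency of residue pairs separated by `skip` positions.

    Assumption (not from the source): skip=0 is an ordinary bi-gram,
    skip=1 pairs residue i with residue i+2, and so on.
    """
    counts = dict.fromkeys(PAIRS, 0)
    gap = skip + 1
    total = 0
    for i in range(len(sequence) - gap):
        pair = sequence[i] + sequence[i + gap]
        if pair in counts:  # pairs containing non-standard residues are ignored
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PAIRS)
    return [counts[p] / total for p in PAIRS]


def combined_skip_features(sequence, max_skip=2):
    """Concatenate skip-0 .. skip-max_skip vectors (hypothetical combination)."""
    features = []
    for k in range(max_skip + 1):
        features.extend(skip_bigram_features(sequence, skip=k))
    return features
```

A vector built this way (400 values per skip level) could then be passed to any standard classifier, such as the Random Forest mentioned in the abstract. The character embedding features would be a separate feature set and are not sketched here.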
Pages: 18-21
Number of pages: 4