Accurate prediction of essential proteins using ensemble machine learning

Authors
Lu, Dezhi [1]
Wu, Hao [1]
Hou, Yutong [2]
Wu, Yuncheng [3]
Liu, Yuanyuan [1]
Wang, Jinwu [1, 2]
Affiliations
[1] Shanghai Univ, Sch Med, Shanghai 200444, Peoples R China
[2] Shanghai Jiao Tong Univ, Shanghai Peoples Hosp 9, Dept Orthopaed Surg, Shanghai Key Lab Orthopaed Implants, Sch Med, Shanghai 200011, Peoples R China
[3] Univ Shanghai Sci & Technol, Shanghai 200093, Peoples R China
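Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
protein-protein interaction (PPI); essential proteins; deep learning; ensemble learning; NETWORK; FRAMEWORK
DOI
10.1088/1674-1056/ad8db2
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract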
Essential proteins are crucial for biological processes and can be identified through both experimental and computational methods. While experimental approaches are highly accurate, they often demand extensive time and resources. To address these challenges, we present a computational ensemble learning framework designed to identify essential proteins more efficiently. Our method begins by using node2vec to transform proteins in the protein-protein interaction (PPI) network into continuous, low-dimensional vectors. We also extract a range of features from protein sequences, including graph-theory-based, information-based, compositional, and physicochemical attributes. Additionally, we leverage deep learning techniques to analyze high-dimensional position-specific scoring matrices (PSSMs) and capture evolutionary information. We then combine these features for classification using various machine learning algorithms. To enhance performance, we integrate the outputs of these algorithms through ensemble methods such as voting, weighted averaging, and stacking. This approach effectively addresses data imbalance and improves both robustness and accuracy. Our ensemble learning framework achieves an AUC of 0.960 and an accuracy of 0.9252, outperforming other computational methods. These results demonstrate the effectiveness of our approach in accurately identifying essential proteins and highlight its superior feature extraction capabilities.
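The abstract names voting and weighted averaging as two of the ways base-classifier outputs are combined. The following is a minimal sketch of those two rules only, not the authors' implementation. The three base classifiers, their probabilities, the weights, and the 0.5 threshold are all hypothetical values chosen for illustration. Stacking is left out because it requires training a meta-learner on the base outputs.

```python
from collections import Counter


def hard_vote(labels):
    """Return the majority class label among the base classifiers."""
    return Counter(labels).most_common(1)[0][0]


def weighted_average(probs, weights):
    """Return the weighted mean of per-classifier probabilities."""
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probs, weights)) / total


def to_label(prob, threshold=0.5):
    """Map a probability to 1 (essential) or 0 (non-essential)."""
    return int(prob >= threshold)


# Hypothetical probabilities of "essential" from three base classifiers.
model_probs = {
    "P1": [0.90, 0.80, 0.60],
    "P2": [0.20, 0.40, 0.70],
    "P3": [0.55, 0.45, 0.30],
}
# Hypothetical weights, e.g. set in proportion to validation performance.
weights = [0.5, 0.3, 0.2]

for protein, probs in model_probs.items():
    vote = hard_vote([to_label(p) for p in probs])
    avg = weighted_average(probs, weights)
    print(protein, "vote:", vote, "weighted:", round(avg, 3), "label:", to_label(avg))
```

Hard voting discards how confident each classifier is, while weighted averaging keeps that information and lets better-performing classifiers count for more. In practice the weights would be chosen on held-out data rather than fixed by hand.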
Pages: 8